Executive Summary
"Working with Big Data" by Sara Foster

The volume of data generated worldwide has grown sharply in recent years. This data, often called “big data”, has created a need for extensive analysis. This summary describes what working with big data involves.

To qualify as big data, a data set must be too large to analyze with conventional statistical methods alone. There are two types of big data: structured data, which is easily categorized, such as numbers and words, and unstructured data, which includes more complex information, such as photos and reviews. The growth of the internet in recent years has increased the amount of unstructured data, which is more complicated to analyze.

Workers who analyze big data are usually called data scientists, although many other job titles exist. They use established statistical methods as well as newer ones. Many of the newer methods rely on computer programs to do much of the analysis, because the files are too large for people to process by hand. Before the analysis, workers must find a way to collect the data, store it, and remove unnecessary information from it. After the analysis, they can create graphics that present the results more effectively.
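As a rough illustration of the clean-then-analyze steps described above, the following Python sketch removes unusable records from a tiny data set and then computes summary statistics. The records, field names, and the `clean` and `summarize` functions are hypothetical and not taken from the article; real big data work would use far larger data sets and specialized tools.

```python
from statistics import mean, median

# Hypothetical raw records; some amounts are missing or unusable.
raw_records = [
    {"id": 1, "amount": "19.99"},
    {"id": 2, "amount": ""},       # missing value
    {"id": 3, "amount": "5.00"},
    {"id": 4, "amount": "n/a"},    # unusable value
    {"id": 5, "amount": "42.50"},
]


def clean(records):
    """Keep only the records whose amount parses as a number."""
    cleaned = []
    for record in records:
        try:
            amount = float(record["amount"])
        except ValueError:
            continue  # drop rows that cannot be analyzed
        cleaned.append({"id": record["id"], "amount": amount})
    return cleaned


def summarize(records):
    """Compute simple summary statistics over the cleaned amounts."""
    amounts = [record["amount"] for record in records]
    return {
        "count": len(amounts),
        "mean": mean(amounts),
        "median": median(amounts),
    }


cleaned = clean(raw_records)
summary = summarize(cleaned)
print(summary)
```

In this sketch, two of the five records are dropped during cleaning, so the statistics are computed over the remaining three.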

Job tasks differ depending on the source of the big data. Sometimes, however, data from one source can be used by another, or combined with that other source's data. Sources and sectors that use some form of big data analysis include businesses, e-commerce, finance, government, healthcare, science, social networking, telecommunications, politics, utilities, and smart meters. Wages for analysts, who may be statisticians, computer programmers, or members of another occupation, vary by job: the median wage is $75,560 for a statistician and $74,280 for a computer programmer. Both occupations are projected to see job growth by 2020.

There are many challenges to overcome when working with big data. The biggest is finding the money to pay for the work. Another is storing the data, given its very large volume. Because the volume is so large, unnecessary data is likely to be included, which creates the challenge of identifying and removing the data that is unusable. Information can also be misinterpreted, especially when the data is unstructured, and a misinterpretation raises the question of whether the analysis actually measures what it is meant to measure. Big data also raises legal problems. The first is determining who actually owns the data. The second is deciding how to protect and control the data after it has been collected.

Finding someone with the skills to perform big data tasks can be a challenge because qualified workers are in short supply. The work requires more than knowledge of statistics or computer programming; workers are also advised to have a background in, and familiarity with, the industry that produces the data. Common degrees among big data workers are mathematics, statistics, and computer programming, and many workers hold a degree beyond the bachelor’s level. Employers also look for skills such as problem solving, communication, teamwork, and intellectual curiosity.