Executive Summary

Working with Big Data

By Sara Royster

This article introduces work with big data: what big data is, what the work involves, the challenges it presents, and how to prepare for it.

What is Big Data?

Big data is defined as a collection of large data sets that cannot be analyzed with normal statistical methods. The data can be anything from numbers, videos, and pictures to words and phrases. There are two types of big data: structured and unstructured. Structured data are numbers and words that can be categorized and analyzed easily. Unstructured data is more complex information that cannot be separated into categories or analyzed easily, so its analysis relies on keywords that allow the data to be filtered by search terms. Big data comes from many fields, including business, e-commerce, finance, government, and health care.
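
As a rough illustration of filtering unstructured data by keywords, the following Python sketch keeps only the text entries that mention at least one search term. The function name filter_by_keywords, the sample posts, and the keywords are all invented for this example; they do not come from the article, and real big data work would use far larger data sets and more capable tools.

```python
import re


def filter_by_keywords(documents, keywords):
    """Return the documents that mention at least one keyword (case-insensitive)."""
    # Build one pattern that matches any of the keywords as a whole word.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    return [doc for doc in documents if pattern.search(doc)]


# Hypothetical sample of unstructured text (invented for illustration).
posts = [
    "Flu season is hitting our office hard this year.",
    "Great weather for a picnic today!",
    "My doctor recommended a flu shot and rest.",
]

# Keep only the posts that mention "flu" or "fever".
matches = filter_by_keywords(posts, ["flu", "fever"])
for post in matches:
    print(post)
```

Matching on whole words keeps a term like "flu" from also matching "fluent", which is one reason the choice of keywords matters when filtering this kind of data.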

Working with Big Data

Some of the work with big data is automated, but workers are still involved in collecting, processing, and analyzing the information. Those who work with big data are mostly known as data scientists, although the U.S. Bureau of Labor Statistics (BLS) classifies them as statisticians, computer programmers, or under other titles, depending on their tasks. These workers study big data using both conventional and newly developed statistical methods, and they use computer programs and algorithms to detect patterns and find usable information. Once they have the data, the workers find a way to store it. Then they process and clean it. To make the analysis easier, workers often consult a manager to determine which data is irrelevant so that it can be removed. Workers also consult with computer programmers to write the code used to analyze the data. After the analysis, they create graphics or tables to summarize the results. As work with big data grows, more people will be able to use the information it produces.
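
The cleaning, filtering, and summarizing steps described above can be sketched in a few lines of Python. This is only a small-scale illustration under stated assumptions: the records, the region names, the "Test" region treated as irrelevant, and the helper functions clean and summarize are all hypothetical and are not taken from the article.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw records (invented for illustration); None marks a missing value.
raw_records = [
    {"region": "East", "sales": 120.0},
    {"region": "East", "sales": None},
    {"region": "West", "sales": 95.5},
    {"region": "West", "sales": 104.5},
    {"region": "Test", "sales": 1.0},
]


def clean(records, irrelevant_regions):
    """Drop records with missing values or from regions judged irrelevant."""
    return [
        r for r in records
        if r["sales"] is not None and r["region"] not in irrelevant_regions
    ]


def summarize(records):
    """Return the average sales per region as a dictionary."""
    by_region = defaultdict(list)
    for r in records:
        by_region[r["region"]].append(r["sales"])
    return {region: mean(values) for region, values in by_region.items()}


# Clean the data, then summarize it as a small table.
cleaned = clean(raw_records, irrelevant_regions={"Test"})
summary = summarize(cleaned)

for region, avg in sorted(summary.items()):
    print(f"{region:<6} {avg:>8.2f}")
```

Deciding which records count as irrelevant is passed in as a parameter here because, as the summary notes, that decision is often made together with a manager rather than by the code itself.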

Challenges Presented by Big Data

The growth of big data has presented new challenges to the people who work with it. One of the biggest is funding: because big data is a new phenomenon, the funds for this work are often targeted for cuts in today's economy. Storage is another challenge because the data can require hundreds of servers. Finding data that can actually be used is also difficult and very time consuming because of the large volume of data. Ensuring that big data measures what it is meant to measure can be a challenge as well, because the large amount of unstructured data can make it unclear how the data should be interpreted. Another challenge is the question of who owns the data, which remains unanswered because few laws exist to resolve it. A final challenge is how to protect and control the data, because analysts can be responsible for finding ways to keep it secure.

Preparing to Work with Big Data

In addition to earning a bachelor's degree, many people who work with big data pursue further schooling. Courses in mathematics, statistics, and computer programming prepare students for this work. Math helps with logical thinking and problem solving, statistics provides analytical knowledge, and computer programming is a must-have for working with big data. Some workers may also need education in the industry they work in if it is highly technical, and work experience may be required as well. Workers also need to stay up to date with the fast-changing world of big data. Skills that people may need to work with big data include problem solving, communication, teamwork, and, most importantly, curiosity. Problem-solving skills are important because analysts have to come up with new ways of doing things, and communication skills are important because workers need to explain their results clearly. Teamwork is important because work is usually spread among teams of analysts. Curiosity matters most because technology is always changing, so workers must be willing to learn on the go.