Seth Taylor

Executive Summary

This article discusses the rapid growth in the amount of data in the world, known as the “big data” phenomenon.

What is big data?

Big data refers to collections of datasets that are too large to be analyzed using traditional statistical methods. Because of their size, these datasets can be measured in exabytes (one exabyte is one billion gigabytes). The data can come from a variety of sources, such as pictures, videos, maps, words and phrases. Examples of big data include customer reviews, websites, comments on social networks, electronic medical records and bank records.

There are two types of big data: structured and unstructured. Structured data consists of numbers and words that can be easily categorized and analyzed. Examples are the readings produced by network sensors in electronic devices, smartphones, and GPS systems. Unstructured data is more complex and cannot be easily categorized or analyzed. Examples are customer reviews, photos, and comments on social networks.
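The difference between the two types can be illustrated with a minimal Python sketch. The GPS readings, the field names, and the review text are invented for illustration and do not come from any real dataset. The structured records can be averaged directly, while the free-text review has to be broken into words before anything can be counted.

```python
from collections import Counter
import re

# Structured data: every record has the same named fields (hypothetical GPS readings).
gps_readings = [
    {"device": "phone-1", "lat": 40.0, "lon": -75.0, "speed_kmh": 30.0},
    {"device": "phone-1", "lat": 40.1, "lon": -75.1, "speed_kmh": 50.0},
    {"device": "phone-2", "lat": 41.0, "lon": -74.0, "speed_kmh": 70.0},
]

def average_speed(readings):
    """Average a numeric field directly; no interpretation step is needed."""
    return sum(r["speed_kmh"] for r in readings) / len(readings)

# Unstructured data: free text with no fixed fields (hypothetical customer review).
review = "Great phone, great battery. The camera is not great."

def word_counts(text):
    """Split free text into lowercase words so they can be counted."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)

print(average_speed(gps_readings))
print(word_counts(review).most_common(1))
```

Note that the word count treats "not great" the same as "great", which is one reason unstructured data is harder to analyze than structured data.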

Working with big data

Big data is analyzed using computer programs and algorithms. Someone who works with big data is called a data scientist. Data scientists use multiple programs and algorithms to sort through the data and find patterns, then create graphs, charts, or tables to summarize the results. They also have to find a way to store the data, since traditional storage methods do not have enough capacity for big data.
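The sort, find patterns, and summarize workflow described above can be sketched on a very small scale in Python. The purchase records and field names below are hypothetical, and a real big data system would spread this work across many machines rather than hold the records in a single list.

```python
from collections import defaultdict

# Hypothetical purchase records; the field names are assumptions for illustration.
purchases = [
    {"category": "books", "amount": 12.50},
    {"category": "games", "amount": 40.00},
    {"category": "books", "amount": 7.50},
    {"category": "music", "amount": 9.00},
    {"category": "games", "amount": 20.00},
]

def summarize(records):
    """Group records by category and return (category, count, total) rows, largest total first."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for record in records:
        totals[record["category"]] += record["amount"]
        counts[record["category"]] += 1
    rows = [(category, counts[category], totals[category]) for category in totals]
    return sorted(rows, key=lambda row: row[2], reverse=True)

def format_table(rows):
    """Render the summary rows as a plain-text table."""
    lines = [f"{'category':<10}{'count':>6}{'total':>10}"]
    for category, count, total in rows:
        lines.append(f"{category:<10}{count:>6}{total:>10.2f}")
    return "\n".join(lines)

print(format_table(summarize(purchases)))
```

The resulting table ranks categories by total spending, which is the kind of pattern an analyst might then present as a chart.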

Big data can be found in almost any field of work, and the tasks of data analysts differ from field to field. The following are examples of big data in different fields of work.
Business: Purchase data and customer reviews are analyzed to develop new products and determine where improvements should be made.
E-commerce: Purchase data, customer reviews, comments and suggestions are analyzed to improve a company’s website and make browsing easier for the customer.
Finance: Analysts study transaction data, account data, credit and debit transactions, and financial market data to find security breaches and fraud.
Healthcare: Patient data and even videos of surgeries are all big data. Analysts use social networking data to identify disease outbreaks in real time. Information from DNA projects is used to develop drugs specific to an individual’s genetic makeup.
Science: Collecting, transporting and storing scientific data creates vast numbers of data sets. Analysts collect the data on site at the experiment and ship it to a lab to be analyzed.
Social Networking: Analysts collect massive amounts of pictures, comments, and videos and sort through them using key terms. This information makes targeted advertising easier and helps businesses make better products.
Telecommunications: Using data collected from cellphones, analysts can tailor a phone to the user’s personal preferences. This information can also help reduce dropped calls and prevent other problems.
Other: Big data is also used in politics, appliances and utilities.
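Sorting through content using key terms, as described in the Social Networking example, could look roughly like the following Python sketch. The comments and the key terms are made up for illustration, and matching whole words is only one simple approach an analyst might take.

```python
import re

# Hypothetical social media comments and key terms, invented for illustration.
comments = [
    "Love the new phone, the battery lasts all day",
    "My order arrived late again",
    "Battery died after two hours, very disappointed",
    "Great photos from the weekend trip",
]
key_terms = {"battery", "phone"}

def matching_terms(text, terms):
    """Return the key terms that appear as whole words in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return words & terms

def filter_by_terms(items, terms):
    """Keep only the items that mention at least one key term."""
    return [item for item in items if matching_terms(item, terms)]

for comment in filter_by_terms(comments, key_terms):
    print(sorted(matching_terms(comment, key_terms)), comment)
```

Only the two comments that mention a key term are kept, which narrows the material an analyst has to read.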

Challenges presented by big data

Data analysts face many challenges when working with big data, and it is their job to find solutions to them. Funding for big data analysis software has been limited, especially during the recent recession. Another challenge is storage: hundreds of servers are required to store and process the information, and the data must remain easily accessible. Other challenges include distinguishing usable data from unusable data, ensuring that the data is measured accurately, establishing who owns the data, and protecting and controlling it.

Preparing to work with big data

Most data analysts hold a master’s degree or higher and typically specialize in math, statistics or computer science. They take courses in math, statistics and computer programming, along with some engineering courses. In highly technical fields, it is important for workers to have education in their industry as well.

Data analysts are problem solvers and must be creative when searching for solutions. Skills necessary for success include communication, teamwork, and curiosity.