What is Big Data?

31 Aug 2017

Big Data is everywhere!

Big data is an important topic in computing because of the massive amounts of data being collected by companies and organizations such as Google, Amazon, and even the National Security Agency. It is critical to develop algorithms that can handle big data. Many applications, such as the Internet of Things, rely on it.

According to Google, big data refers to extremely large data sets that may be analyzed computationally to reveal patterns, trends, and associations, especially relating to human behavior and interactions. These data sets are so large or complex that traditional data processing software is inadequate to deal with them. Challenges in working with them include capturing, storing, analyzing, curating, searching, sharing, transferring, visualizing, querying, and updating the data, as well as information privacy. A data set is usually called big data when its size makes it impossible to capture, curate, manage, and process with common software tools in a tolerable time frame.

Some characteristics that describe big data are volume, variety, velocity, variability, and veracity. Volume is the amount of generated and stored data. The size of the data helps determine its value and potential, and whether it can be called big data at all. Variety is the type and nature of the data, which helps the people analyzing it make effective use of the resulting insight. Velocity is the speed at which the data is generated and processed to meet the demands and challenges that come up in the path of growth and development. Variability is inconsistency in the data set, which can hamper the processes used to handle and manage it. Veracity is the quality of the captured data, which affects how accurate the analysis can be.

These are some of the characteristics of big data, but for factory work or cyber-physical systems they can be expanded to connection (sensors and networks), cloud (computing and data on demand), cyber (model and memory), content/context (meaning and correlation), community (sharing and collaboration), and customization (personalization and value). To process data at this scale, advanced tools such as analytics and algorithms are used to reveal meaningful information.

How do we use them?

Applications of big data include banking and securities; communications, media, and entertainment; government; and the Internet of Things.

In banking and securities, big data is used to monitor financial market activity. The Securities and Exchange Commission (SEC) uses network analytics and natural language processing to catch illegal trading activity in the financial markets.

In communications, media, and entertainment, big data is used to analyze consumer data and behavioral data simultaneously to create detailed profiles. These profiles tell consumer-facing businesses what to do for different target audiences, and they make it possible to recommend content on demand and measure content performance.

An example of this is Spotify. Spotify is an on-demand music service that uses Hadoop big data analytics, which will be explained in a later section, to gather data from all its users and give music recommendations to each individual user.

Let's Dive Deeper

The Internet of Things (IoT) involves aggregating and compressing massive amounts of low-latency, low-duration, high-volume machine-generated data coming from a wide variety of sensors to support real-time use cases such as operational optimization, real-time ad bidding, fraud detection, and security breach detection. The biggest difference between IoT and big data is that IoT analytics have to include streaming data management and edge analytics. These two things, together with everything that big data does, make up IoT analytics. Without them, we simply have big data.

Streaming data management is the ability to ingest, aggregate, and compress real-time data as it is collected. Edge analytics automatically analyzes real-time data and makes real-time decisions to optimize operational performance. Applications of IoT include smart homes and wearables. Smart homes rely on real-time interaction to give the user good feedback. Say you touch a button and it dims all the lights. This is considered IoT because a real-time interaction gets a real-time response. Wearables are also considered IoT because they gather data such as your heart rate or how much you have exercised, track your movement in real time, and use that data to give you real-time feedback.
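To make the two ideas a bit more concrete, here is a minimal Python sketch, not a real IoT stack. It ingests readings one at a time, compresses each window of readings into a small summary (the streaming data management part), and decides locally whether to raise an alert (the edge analytics part). The function names, the window size of 5, the threshold of 150, and the heart-rate numbers are all made up for illustration.

```python
from collections import deque
from statistics import mean


def summarize_window(readings):
    """Compress a window of raw readings into one small summary record."""
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": mean(readings),
    }


def process_stream(readings, window_size=5, alert_above=150):
    """Ingest readings one at a time and emit one summary per full window.

    The alert flag is decided right here, next to the data ("at the edge").
    window_size and alert_above are hypothetical values for illustration.
    """
    window = deque(maxlen=window_size)
    summaries = []
    for value in readings:
        window.append(value)
        if len(window) == window_size:
            summary = summarize_window(window)
            summary["alert"] = summary["mean"] > alert_above
            summaries.append(summary)
            window.clear()  # start collecting the next window
    return summaries


# Made-up heart-rate samples (beats per minute) from a hypothetical wearable.
heart_rate = [72, 75, 71, 74, 73, 150, 158, 161, 155, 160]
summaries = process_stream(heart_rate)
for summary in summaries:
    print(summary)
```

Ten raw readings become two summary records, and only the second one is flagged. A real system would send the summaries on to a server instead of printing them, but the idea of shrinking the data and reacting before it leaves the device is the same.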

How do engineers use them?

Because there are such massive amounts of data, the analysis has to be computer-driven. Algorithms make it possible to model the data statistically, and there are a number of algorithms that can help us with this.

The algorithms are chosen based on goals that have been established beforehand. This is equivalent to a statistician choosing the appropriate strategy for a problem, except that with this much data the process is fully automated. Different algorithms were designed to address different problems in different areas of interest. There are many to choose from, some are clearly better suited to a given task than others, and it is sometimes useful to try more than one. Trying multiple algorithms lets you compare them, and the comparison often yields unexpected results that can tell you more about the product or the customers.
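As a rough illustration of trying more than one algorithm, the Python sketch below fits two very simple models to the same training data and compares their error on data they have not seen. The two models (a mean baseline and a least-squares line), the helper names, and the ad-spend numbers are assumptions chosen to keep the example small. Real big data work would use far larger data sets and dedicated libraries.

```python
def fit_mean(xs, ys):
    """Baseline model: always predict the average of the training targets."""
    avg = sum(ys) / len(ys)
    return lambda x: avg


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def mean_squared_error(model, xs, ys):
    """Average squared difference between predictions and actual values."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def compare(candidates, train, test):
    """Fit every candidate on the training data, score it on held-out data."""
    train_x, train_y = train
    test_x, test_y = test
    scores = {}
    for name, fit in candidates.items():
        model = fit(train_x, train_y)
        scores[name] = mean_squared_error(model, test_x, test_y)
    return scores


# Made-up numbers: ad spend (x) against sales (y).
train = ([1, 2, 3, 4, 5, 6], [2.1, 3.9, 6.2, 8.1, 9.8, 12.1])
test = ([7, 8], [14.0, 16.1])

candidates = {"mean baseline": fit_mean, "least-squares line": fit_line}
scores = compare(candidates, train, test)
best = min(scores, key=scores.get)  # lowest error wins
print(scores)
print("best:", best)
```

Scoring on held-out data rather than on the training data is the important design choice here, since a model can look good on data it has already seen and still predict poorly on new data.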

Okay!

Big data is important for us to understand because it can reduce costs, enable faster and better decision making, and lead to new products and services. Cloud-based big data technologies bring significant cost advantages when it comes to storing large amounts of data.

With the ability to analyze new sources of data, businesses can analyze information immediately and make decisions based on what they have learned in real time. The ability to gauge customer needs and satisfaction through analysis gives businesses the power to give customers what they want in the form of new products and services.