Data Aggregation
Data Collection Methods
Data can be collected in many forms, such as video, audio, written, and oral records. The source of the data partly determines how usable and useful it is. For example, a poll conducted in person may be less accurate than an anonymous online poll, because respondents may answer less candidly when they can be identified.
Unstructured vs Structured data
There are two main states of data: unstructured and structured. Unstructured data is information that has not been organized into meaningful categories. An example would be a copy of the last 100 receipts of a store. All the information is there, but it still has to be categorized. Structured data is information that has been organized into useful categories. If the receipts were organized by price or by number of items sold, the data set would be classified as structured.
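The receipt example can be sketched in a few lines of Python. This is only an illustration: the receipt format ("item price xquantity"), the sample lines, and the names parse_receipt and units_sold are assumptions made for the sketch, not part of any real store system.

```python
from collections import defaultdict

# Hypothetical raw receipt lines: the information is all there,
# but nothing is labeled or categorized yet (unstructured).
raw_receipts = [
    "coffee 3.50 x2",
    "bagel 2.25 x1",
    "coffee 3.50 x1",
]

def parse_receipt(line):
    """Split one raw line into named fields: item, price, quantity."""
    item, price, qty = line.split()
    return {"item": item, "price": float(price), "quantity": int(qty.lstrip("x"))}

# Structured form: every record has the same named categories.
structured = [parse_receipt(line) for line in raw_receipts]

# With named fields, the records can be organized by a category,
# here the number of items sold per product.
units_sold = defaultdict(int)
for record in structured:
    units_sold[record["item"]] += record["quantity"]
```

Once the fields are named, sorting by price or totaling items sold becomes a one-line operation, which is what makes the structured form more useful.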
Extraction
Data can be extracted in a number of ways. Raw data can be published in tables and graphs, and patterns found in those tables and graphs can help explain why the data falls within a certain range. Automated programs (bots) can also search the internet and copy information from websites. This is known as screen scraping. The copied information can then be analyzed for patterns alongside existing data sets.
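As a rough sketch of screen scraping, the example below uses Python's standard html.parser module to copy prices out of a small piece of HTML. The page content, the "price" class name, and the PriceScraper class are made up for illustration; a real bot would first download the page and would have to match that site's actual markup.

```python
from html.parser import HTMLParser

class PriceScraper(HTMLParser):
    """Collect the text inside every <span class="price"> element."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "span" and ("class", "price") in attrs:
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False

    def handle_data(self, data):
        if self._in_price:
            self.prices.append(data.strip())

# Hypothetical page content standing in for a downloaded website.
page = (
    '<ul><li>Coffee <span class="price">3.50</span></li>'
    '<li>Bagel <span class="price">2.25</span></li></ul>'
)

scraper = PriceScraper()
scraper.feed(page)
print(scraper.prices)
```

The scraper ignores everything on the page except the elements it was told to look for, which is how a bot turns a web page into a small data set.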
Internet's Data Structure
The internet is a vast, unstructured place. Search engines are used to find specific websites and data within it. Each search engine's algorithm has its own advantages and disadvantages. For example, an algorithm can favor websites that are more popular or that are promoted, without fully considering the website's content.
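The difference between ranking by content and ranking by popularity can be shown with a toy example. This is not how any real search engine works; the two pages, their popularity numbers, and the scoring functions are assumptions chosen only to make the contrast visible.

```python
# Hypothetical pages: one is relevant but little known,
# the other is popular but only loosely related to the query.
pages = [
    {"url": "a.example",
     "text": "structured data tutorial with structured data examples",
     "popularity": 10},
    {"url": "b.example",
     "text": "celebrity news and data",
     "popularity": 900},
]

def content_score(page, query):
    """Count how often the query words appear in the page text."""
    words = page["text"].split()
    return sum(words.count(term) for term in query.split())

def rank_by_content(pages, query):
    """Order pages by how well their text matches the query."""
    return sorted(pages, key=lambda p: content_score(p, query), reverse=True)

def rank_by_popularity(pages, query):
    """Keep any page that mentions a query word, then order by popularity."""
    matching = [p for p in pages if content_score(p, query) > 0]
    return sorted(matching, key=lambda p: p["popularity"], reverse=True)

query = "structured data"
by_content = [p["url"] for p in rank_by_content(pages, query)]
by_popularity = [p["url"] for p in rank_by_popularity(pages, query)]
print(by_content)
print(by_popularity)
```

With the same query, the content-based ranking puts a.example first, while the popularity-based ranking puts b.example first even though it barely mentions the topic.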
Storage, data persistence, privacy vs utility
Today, large amounts of data are stored in the "cloud", that is, on remote servers. Storing data on these servers allows it to be accessed from nearly anywhere. Because data on these servers is backed up, the data persists. Data persistence is the survival of data even after it has been deleted. An example would be an embarrassing picture that is uploaded to Facebook and deleted within an hour. Even though the post has been deleted, Facebook's servers may still hold a copy of the picture. Even if the servers did not keep the picture, other users could have saved the post.
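One way to picture data persistence is a server that hides a deleted item from view but leaves its backup copy alone. The PhotoServer class below is a made-up model for illustration only; it does not describe how Facebook or any real service stores or deletes data.

```python
class PhotoServer:
    """Toy server: deleting a photo hides it but does not erase the backup."""

    def __init__(self):
        self.visible = {}  # what users can currently see
        self.backup = {}   # copies kept for recovery

    def upload(self, photo_id, data):
        self.visible[photo_id] = data
        self.backup[photo_id] = data  # a backup copy is made at upload time

    def delete(self, photo_id):
        # Only the visible copy is removed; the backup is untouched.
        self.visible.pop(photo_id, None)

server = PhotoServer()
server.upload("photo-1", b"embarrassing picture")
server.delete("photo-1")

print("photo-1" in server.visible)  # the post is gone from view
print("photo-1" in server.backup)   # but the data still exists
```

From the user's point of view the picture is gone, yet a copy remains on the server, which is the sense in which the data persists.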
Users may give up some privacy in exchange for utility. An example of this would be allowing your phone to track your location so you can get directions. If this information is leaked, it can be disastrous for the user. It is best to strike a balance between privacy and utility so the user can have a safe yet productive experience.
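One possible way to balance the two, sketched below, is to share a less precise location than the phone actually knows. The coordinates and the coarsen function are assumptions for illustration; rounding to two decimal places keeps a position to within roughly a kilometer, which may be enough for some services but too coarse for turn-by-turn directions.

```python
def coarsen(latitude, longitude, decimals=2):
    """Round coordinates so that only an approximate location is shared."""
    return (round(latitude, decimals), round(longitude, decimals))

# Hypothetical precise location reported by a phone.
precise = (40.712776, -74.005974)

# Less precise location: more private, but less useful for directions.
coarse = coarsen(*precise)
print(coarse)
```

Fewer decimal places give more privacy and less utility, so the right setting depends on what the location is being used for.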