Examples of Unstructured Data

Vishal Singh https://vishalsingh.org

Modern life is inundated with data, much of it generated in unstructured forms such as text, audio, video, and images. When trained on large samples with ample computing resources, ML/AI algorithms have shown great promise in extracting semantic meaning from unstructured data. Examples range from everyday technologies such as spell checking of documents and spam filtering of email, to speech recognition (e.g., Siri and Alexa) and image recognition, with applications from medicine to autonomous driving. This document provides some examples along this line of work.

Google Ngram

Quick Read: WIRED, The Pitfalls of Using Google Ngram

Science Article: Quantitative Analysis of Culture Using Millions of Digitized Books

Making Sense of Textual Data: Perhaps the most ambitious text analytics project to date, and one that gave momentum to this line of inquiry.
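To make the idea concrete, here is a minimal sketch of what counting n-grams (runs of n consecutive words) looks like. It is an illustration only, not Google's actual pipeline: the tokenizer, the function name ngram_counts, and the tiny two-entry corpus are all assumptions made up for this example.

```python
from collections import Counter
import re


def ngram_counts(text, n):
    """Count n-grams (runs of n consecutive words) in a text."""
    # Lowercase and keep only simple word tokens (a simplifying assumption).
    tokens = re.findall(r"[a-z']+", text.lower())
    # Slide a window of width n over the token list.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(g) for g in grams)


# Hypothetical toy corpus keyed by publication year (invented for illustration).
corpus = {
    1900: "the telegraph office sent the telegraph message",
    2000: "the internet changed how the internet user reads",
}

# Relative frequency of a 1-gram per year, the kind of series an Ngram chart plots.
for year, text in corpus.items():
    counts = ngram_counts(text, 1)
    total = sum(counts.values())
    print(year, counts["internet"] / total)
```

Dividing by the total number of tokens per year matters because the volume of published text differs across years; raw counts alone would mostly track corpus size.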

Google Cloud NLP

Great API. Give it a try to see what information you can extract from text.

Google Translate

Try it in Google Sheets with the GOOGLETRANSLATE function, e.g. =GOOGLETRANSLATE(A1, "en", "es")

Twitter with GPU Power

Watch on YouTube


N.Y.-based startup with a fantastic API for image recognition and classification

Television Explorer

A collaboration between the Internet Archive and GDELT. Television Explorer allows you to keyword search the closed captioning streams of the Archive’s 6 years of American television news and explore macro-level trends in how America’s television news is shaping the conversation around key societal issues.