The Data Lass

In God we trust, all others must bring data.

    Natural Language Processing: N-gram Extraction for World Cup

    In last month’s blog, I talked about sentiment analysis for social media analysis in the field of computational linguistics, which takes human language and translates it so…

    July 20, 2019 (updated July 22, 2019) by thedatalass

    Four-Year Anniversary Blog: Sentiment Analysis

    Today I celebrate the four-year anniversary of my blog that tries to explain key data science ideas in plain language and socialize what continues to make its…

    June 20, 2019 (updated June 24, 2019) by thedatalass

    Artificial Neural Networks for Predicting Coffee Rust Case Study

    After looking at the nuts and bolts of natural language processing in my last blog, today I want to look at how artificial neural networks (ANNs) can…

    May 24, 2019 by thedatalass

    NLP Punctuation, Lower-Case and StopWords Pre-Processing

    In my March blog, I explained how to use the stemming technique in Natural Language Processing (NLP) to predict whether a particular Tweet could be geolocated to…

    April 26, 2019 (updated May 24, 2019) by thedatalass

    NLP Stemming

    In my February blog, I explained how to use the tokenization technique in Natural Language Processing (NLP) to predict whether a particular Tweet could be geolocated to…

    March 26, 2019 by thedatalass

    NLP Tokenization

    Today I want to continue looking at machine learning case studies for beginners and in particular, the use of tokenization in natural language processing.…

    February 15, 2019 (updated March 26, 2019) by thedatalass

    Missing Data with k-Nearest Neighbor Imputation

    In today’s blog, I want to give a case study of using k-Nearest Neighbor (kNN) imputation to fill in missing data. About a year ago, I talked…

    January 14, 2019 (updated February 15, 2019) by thedatalass

    Machine Learning Algorithm Case Study 4: Spearman’s Dimensionality Reduction

    In a September 2018 blog, I talked about a K-means clustering case study of cyber profiling in Indonesia. Today I want to continue that discussion by giving…

    December 11, 2018 by thedatalass

    The Data Scientist Clarifies the Question – Dengue Data Search

    Originally posted on The Data Lass:
    One of the first steps in the Data Science process is identifying what data you need to answer the question. In…

    November 8, 2018 by thedatalass

    The Data Scientist Clarifies the Question

    One of the first steps in the Data Science process is identifying what data you need to answer the question. In March 2017, I featured a series…

    October 16, 2018 (updated October 17, 2018) by thedatalass
