I am a government professional with over a decade of experience in data product consulting, leadership, project management, editing and writing, and patent examination. I graduated in May 2018 with a Master of Science in Data Science from Indiana University. This blog covers a wide range of data topics presented with a non-technical audience in mind.

Laura H. Kahn

laurakahn2@gmail.com | @LauraHKahn | GitHub


Data scientist with a strong analytical background and two years of graduate school experience using predictive modeling, data processing, and data mining algorithms to solve challenging problems. Public servant with 11 years of experience in data product management and statistical analysis. Involved in data literacy projects and passionate about new applications for artificial intelligence.


United States Patent and Trademark Office, Information Management Specialist (October 2007 – present)

  • Managed data products by interacting with software developers to release and maintain two bulk data systems
  • Collaborated with system operations staff to resolve technical problems and ensure 24/7 customer access to data
  • Expert in extracting and reporting insights from system usage statistics with Google Analytics for high-level decision makers
  • Translated customer needs into technical requirements for the agency’s first API project in an Agile environment
  • Expert in presenting technical material to a wide range of audiences – from beginner to executive levels
  • Earned Bronze Medal Award (2017) for five years of consecutive outstanding performance

Indiana University, Data Science Graduate Student (January 2016 – May 2018)

Project – Georeferenced Tweet Location Prediction with NLP

  • Linked keywords in 1.32 million tweets to locations within 10 km using logistic regression and kNN, with an accuracy improvement of up to 40% and a log loss of 1.53
  • Paper presented at the 2018 SMAP Conference

Project – Use of Artificial Neural Networks for Predicting Poverty Competitions

  • Used PCA and correlation matrix for dimensionality reduction of features
  • Selected multilayer perceptron neural network models with F1 scores of 0.85 (World Bank) and 0.99 (Kaggle)

Project – Algorithmic Trading of Coffee Futures with Machine Learning

  • Predicted daily closing prices of coffee futures from time-series data with autoregressive models
  • Models achieved a maximum percent prediction error of 0.00328

United States Patent and Trademark Office, Patent Examiner (September 2004 – October 2007)

  • Conducted technology research for 150+ patent applications and communicated legal findings to customers
  • Received highest performance rating based on work quality and timeliness metrics


  • Big Data Acquisition: Python BeautifulSoup, Scrapy and PySpark libraries
  • Data Cleaning: Python Pandas
  • Data Modeling: Python PyModels library
  • Data Mining: Python SciPy, NumPy, scikit-learn, SQLAlchemy libraries
  • Jupyter integrated development framework
  • Natural Language Processing (English and Spanish): Python NLTK library
  • Machine Learning: Feature Engineering and Deep Learning (Python Keras, Apache Spark ML library)
  • Data Visualization: Tableau, Python Matplotlib, Plotly and Bokeh libraries
  • Univariate Statistical Analysis: Python Statsmodels library
  • Bayesian Inference: Python BayesPy library
  • Network Theory: Neo4j, Social Network Theory
  • Geospatial Analysis: QGIS
  • Program Management: Communication, Leadership, Organization, Problem Solving


Indiana University, M.S. Data Science, January 2016 – May 2018
North Carolina State University, B.S. Textile Engineering and B.A. Spanish, August 1999 – May 2004
Universidad de Santander, Study Abroad for Spanish Language, September – December 2003


Kahn, Laura H. “Spatiotemporal twitter analysis of the Venezuelan food crisis.” Journal of Food Processing Technology, 8:5. (2017): 51. Proceedings from the 2nd International Conference on Food Security and Sustainability. https://www.omicsonline.org/conference-proceedings/2157-7110-C1-062-011.pdf

Kahn, Laura H. “Chévere! Text-Based Twitter Patterns from Venezuelan Food Shortages.” 2018 13th International Workshop on Semantic and Social Media Adaptation and Personalization (SMAP), Zaragoza, 2018, pp. XX-XX. (IEEE URL forthcoming)

