
Python Data Weekly Roundup – Jan 10 2020

In this week’s Python Data Weekly Roundup:

A Comprehensive Learning Path to Understand and Master NLP in 2020

If you’re looking to learn more about Natural Language Processing (NLP) in 2020, this article lays out a good learning path, with links to articles, courses, videos and more to get you started toward becoming proficient with the tools and methods of NLP.

The Best of Both Worlds: Forecasting US Equity Market Returns using a Hybrid Machine Learning – Time Series Approach


Predicting long-term equity market returns is of great importance for investors to strategically allocate their assets. We apply machine learning methods to forecast 10-year-ahead U.S. stock returns and compare the results to traditional Shiller regression-based forecasts more commonly used in the asset-management industry. Machine-learning forecasts have similar forecast errors to a traditional return forecast model based on lagged CAPE ratios. However, machine-learning forecasts have higher forecast errors than the regression-based, two-step approach of Davis et al [2018] that forecasts the CAPE ratio based on macroeconomic variables and then imputes stock returns. When we combine our two-step approach with machine learning to forecast CAPE ratios (a hybrid ML-VAR approach), U.S. stock return forecasts are statistically and economically more accurate than all other approaches. We discuss why and conclude with some best practices for both data scientists and economists in making real-world investment return forecasts.

Source: Improving U.S. stock return forecasts: A “fair-value” CAPE approach

Building machine learning workflows with AWS Data Exchange and Amazon SageMaker

This article describes how to use AWS Data Exchange and Amazon SageMaker to build a machine learning model and machine learning workflows. What I found interesting is the ability to use AWS Data Exchange to find a large number of different types of data.

Tutorial: Python Regex (Regular Expressions) for Data Scientists

I hate regex. Of course I love the functionality and capabilities of regex, but I loathe my inability to come up with my own regex ‘formulas’. I *always* have to search the web for how to do what I’m trying to do. This article doesn’t solve that problem for me, but it does provide a refresher on regex patterns and a reminder of why regex is important.
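For a quick taste of the kind of refresher I mean, here is a minimal sketch using Python’s built-in `re` module. The sample text and both patterns are my own made-up illustrations, not taken from the linked tutorial, and the email pattern is deliberately simplistic rather than a complete validator.

```python
import re

# Hypothetical sample text, invented for this illustration.
text = "Contact alice@example.com or bob.smith@example.org before 2020-01-10."

# Raw strings (r"...") keep backslashes literal so \d and \w reach the regex engine.
# Simplistic email pattern (an assumption): name characters, "@", a domain label,
# a dot, then the rest of the domain.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

# ISO-style date with three capture groups: year, month, day.
DATE_PATTERN = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

# findall returns every non-overlapping match as a list of strings
# (whole matches here, because the email pattern has no capture groups).
emails = EMAIL_PATTERN.findall(text)

# search returns the first match object, or None if nothing matches.
date_match = DATE_PATTERN.search(text)

print(emails)
if date_match:
    year, month, day = date_match.groups()
    print(year, month, day)
```

Compiling the patterns once with `re.compile` keeps them reusable and gives each one a readable name, which helps when you come back to the ‘formula’ months later.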

That’s it for this week’s Python Data Weekly Roundup. Subscribe to our newsletter to receive this weekly roundup in your email.

