CrowdFlower survey of data scientists

We’re working with CrowdFlower on a fun little project. If you’re a data scientist, take this 5-minute survey about your work, and in exchange you’ll get the data afterwards (and hopefully we’ll all learn a thing or two).

How America Injures Itself

We found a fun dataset to play with. The U.S. government’s National Electronic Injury Surveillance System tracks a huge sample of all accidental injuries that happen in the U.S. We’ve cleaned up the data, put it into Statwing, and made it really easy to play with. Play around with it here. (Here’s a little teaser, …

Rundown of data tools

Our friends over at just published a blog post walking through dozens of different data tools (Hadoop, data visualization, databases, etc.). It’s not exhaustive but it’s written in human-friendly language, so we like it.

It’s football season again!

So we’d love to direct your attention to our favorite Statwing blog post of old, Ten Years of NFL Plays Analyzed, Visualized, Quizzified. If you like NFL football, you’ll love that post. If you don’t love NFL football, here’s a lovely post about the human lifecycle presented via responses to the General Social Survey.

Ridge regression

Statwing now enables you to use any of three kinds of regression.

1. Ordinary Least Squares (OLS): OLS is the most common kind of regression.
2. M-estimation: M-estimation regression downweights the impact of outliers. One problem with OLS regression is that if the variable being predicted has an outlier (e.g., most values are between 100 …
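To make the OLS versus M-estimation contrast concrete, here is a minimal sketch in Python with NumPy. It is not Statwing's implementation: the post does not say which M-estimator Statwing uses, so the Huber weighting, the tuning constant 1.345, the MAD-based scale, and the function names `fit_ols` and `fit_huber` are all assumptions made for illustration, and the data is synthetic.

```python
import numpy as np


def fit_ols(x, y):
    """Ordinary least squares: minimize the sum of squared residuals."""
    A = np.column_stack([np.ones(len(x)), x])  # intercept column + predictor
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef  # [intercept, slope]


def fit_huber(x, y, delta=1.345, n_iter=50):
    """M-estimation with Huber weights, fit by iteratively reweighted
    least squares (an assumed choice of M-estimator, for illustration)."""
    A = np.column_stack([np.ones(len(x)), x])
    coef = fit_ols(x, y)  # start from the OLS solution
    for _ in range(n_iter):
        resid = y - A @ coef
        # Robust scale estimate from the median absolute deviation (MAD).
        scale = np.median(np.abs(resid - np.median(resid))) / 0.6745
        scale = max(scale, 1e-12)
        u = np.abs(resid) / scale
        # Points with small residuals keep weight 1; large residuals
        # (outliers) get a weight that shrinks as the residual grows.
        w = np.where(u <= delta, 1.0, delta / np.maximum(u, 1e-12))
        sw = np.sqrt(w)
        coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef


# Synthetic data: a true slope of 2 with mild noise and one large outlier.
rng = np.random.default_rng(0)
x = np.arange(20, dtype=float)
y = 2.0 * x + 1.0 + rng.normal(0.0, 1.0, size=20)
y[-1] += 200.0  # a single outlier in the variable being predicted

ols = fit_ols(x, y)
huber = fit_huber(x, y)
print("OLS slope:  ", ols[1])
print("Huber slope:", huber[1])
```

On data like this, the single outlier pulls the OLS slope well away from 2, while the downweighted fit stays much closer to it.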

Appraisers: Give 10% off Statwing, Get 10% off Statwing

By giving a 10% discount off of Statwing to other appraisers, you can now get Statwing for 10% off, 20% off, or even for free. When you next sign into Statwing you’ll see a “Get 10% off” button in the upper right. Click that to get a personalized link. Copy that link and send it …

Fannie Mae: GLA Adjustments Should Be Higher

Fannie Mae announced last week that it is concerned that many GLA adjustments are “artificially low.” As evidence, they noted that more expensive homes have much higher Price/GLAs than less expensive homes, but only slightly higher GLA adjustments. Fannie Mae implicitly blamed both itself and automated review systems: “The …

Delightful logistic regression

Not unexpectedly, we followed up linear regression with logistic regression. It’s ready to go, so give it a shot with our demo dataset, predicting how likely someone is to be married based on their age, sex, religion, and anything else you’d like to use to predict it.

Delightful multiple linear regression is finally here!

We’ve finally finished multiple linear regression, and it’s awesome. You’ll never find an easier way to disentangle how input variables affect an output variable.

- Results explained in plain English
- Automatic alerts about issues with the regression model, and how to fix them
- Automatic visualizations
- Two-click data transformations to improve your model
- Plainly written guides to …

Statwing integrates with Quandl

Next time you use Quandl, click the “Statwing” button to statistically analyze a Quandl dataset. Let’s say you were looking at this Quandl dataset of bitcoin prices over time. You’d click the “Statwing” button in the right sidebar, find yourself in Statwing, and then start playing with the data. In two clicks you can get …