Some time ago, in June 2013, I gave a lab tutorial on Monte Carlo methods at Microsoft Research. These tutorials are seminar-talk length (45 minutes) but are supposed to be light, accessible to a general computer science audience, and fun.
Clustering debates from UK politicians
What kind of language do British parliamentarians use? We scraped, parsed and vectorised a sample of recent debates from the House of Commons. We then applied a k-means clustering algorithm to these vectors and created a word cloud for each cluster.
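To make the pipeline concrete, here is a minimal R sketch of the clustering step, under the assumption that the scraped debates are already available as a character vector called debate_texts (a hypothetical name). It builds a tf-idf document-term matrix with the tm package, runs k-means, and draws one word cloud per cluster with the wordcloud package; the number of clusters is picked arbitrarily.

library(tm)
library(wordcloud)

# debate_texts: one string per debate (assumed to exist already)
corpus <- Corpus(VectorSource(debate_texts))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeWords, stopwords("en"))

# tf-idf weighted document-term matrix
dtm <- DocumentTermMatrix(corpus, control = list(weighting = weightTfIdf))
m   <- as.matrix(dtm)

# cluster the debate vectors (5 clusters chosen arbitrarily here)
set.seed(1)
km <- kmeans(m, centers = 5)

# one word cloud per cluster, built from the summed term weights
for (k in 1:5) {
  weights <- colSums(m[km$cluster == k, , drop = FALSE])
  wordcloud(names(weights), weights, max.words = 50)
}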
Generating Fibonacci Numbers
This article is about algorithms that can be used to generate Fibonacci numbers.
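As a taste of the kind of methods such an article typically covers, here is a small R sketch of two standard approaches, simple iteration and Binet's closed-form formula. The function names are mine, not the article's.

# iterative approach: O(n) additions
fib_iter <- function(n) {
  if (n <= 1) return(n)
  a <- 0; b <- 1
  for (i in 2:n) {
    s <- a + b
    a <- b
    b <- s
  }
  b
}

# Binet's closed form: exact for small n, but subject to
# floating-point error as n grows
fib_binet <- function(n) {
  phi <- (1 + sqrt(5)) / 2
  round((phi^n - (-phi)^(-n)) / sqrt(5))
}

sapply(0:10, fib_iter)   # 0 1 1 2 3 5 8 13 21 34 55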
Emoticons decoder for social media sentiment analysis in R
Jessica Peterka-Bonetta (jessica@today-is-a-good-day.de)
If you have ever retrieved data from Twitter, Facebook or Instagram with R, you might have noticed a strange phenomenon: while R displays some emoticons properly, it mangles many others, making any further analysis impossible unless you get rid of them. With a little hack, I decoded these emoticons and put them all in a dictionary for further use. I’ll explain how I did it and share the decoder with you.
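As an illustration of what such a decoder ends up doing (a minimal sketch of my own, not the dictionary shared in the post), garbled emoticons can be matched byte-for-byte against a lookup table of known UTF-8 sequences. The names emoji_dict and decode_emoticons below are hypothetical.

# Hypothetical stand-in for the decoder dictionary: raw UTF-8 byte
# sequences for two emoji and their plain-text descriptions
emoji_dict <- data.frame(
  bytes       = c("\xf0\x9f\x98\x80", "\xf0\x9f\x98\x82"),
  description = c("grinning face", "face with tears of joy"),
  stringsAsFactors = FALSE
)

# Return the descriptions of all dictionary emoticons found in a text
decode_emoticons <- function(text, dict) {
  hits <- vapply(dict$bytes,
                 function(b) grepl(b, text, fixed = TRUE, useBytes = TRUE),
                 logical(1))
  dict$description[hits]
}

tweet <- "Great results today \xf0\x9f\x98\x80"
decode_emoticons(tweet, emoji_dict)   # "grinning face"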
7 tools in every data scientist’s toolbox
There is a huge number of machine learning methods, statistical tools and data mining techniques available for any given data-related task, from self-organizing maps to Q-learning, from streaming graph algorithms to gradient boosted trees. Many of these methods, while powerful in specific domains and problem setups, are arcane and understood, let alone used, by only a few.
Denoising Dirty Documents: Part 9
Now that Kaggle’s Denoising Dirty Documents Competition has closed, it’s time to start posting the secrets to getting a very good score in this competition. In this blog, I describe how to take advantage of the first of two information leakages that I used.
Deep Learning Startups, Applications and Acquisitions – A Summary
Most major tech companies use Deep Learning techniques in one way or another, and many have new initiatives on the way. Self-driving cars use Deep Learning to model their environment. Siri, Cortana and Google Now use it for speech recognition, Facebook for facial recognition, and Skype for real-time translation.
LOCF and Linear Imputation with PostgreSQL
This tutorial will introduce various tools offered by PostgreSQL, and by SQL in general (custom functions, window functions, aggregate functions, and the WITH clause, also known as a CTE or Common Table Expression), for implementing a program that imputes numeric observations within a column, applying linear interpolation where possible and forward and backward padding where necessary. I’m going to add and explain those constructs progressively, step by step, so it's no problem if you are new to the scene. I am very much interested in input regarding potential downsides of the implementation and possible improvements.
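The post itself builds this in PostgreSQL; purely to pin down the imputation rule being described (linear interpolation for interior gaps, forward and backward padding at the edges), here is the same behaviour sketched in R with the zoo package, on a made-up vector.

library(zoo)

x <- c(NA, NA, 2, NA, NA, 8, NA)

# interior gaps: linear interpolation between the surrounding observations
filled <- na.approx(x, na.rm = FALSE)        # NA NA 2 4 6 8 NA

# trailing gap: forward padding (last observation carried forward)
filled <- na.locf(filled, na.rm = FALSE)     # NA NA 2 4 6 8 8

# leading gap: backward padding (next observation carried backward)
filled <- na.locf(filled, fromLast = TRUE)   # 2 2 2 4 6 8 8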
How-to: Build a Machine-Learning App Using Sparkling Water and Apache Spark
Thanks to Michal Malohlava, Amy Wang, and Avni Wadhwa of H2O.ai for providing the following guest post about building machine-learning apps using Sparkling Water and Apache Spark on CDH.
On the consistency of ordinal regression methods
Fabian Pedregosa
My latest work (with Francis Bach and Alexandre Gramfort) is on the consistency of ordinal regression methods. It has the wildly imaginative title of “On the Consistency of Ordinal Regression Methods” and is currently under review, but you can read the draft on arXiv. If you have any thoughts about it, please leave me a comment!