Hopefully, one’s coding habits are constantly improving. If you feel any doubt about yourself, I suggest looking back at something you wrote in 2011.
Day 01 – little helper checkdir
We at STATWORX work a lot with R and we often use the same little helper functions within our projects. These functions ease our daily work life by reducing repetitive code parts or by creating overviews of our projects. At first, there was no plan to make a package, but I soon realised that it would be much easier to share and improve those functions if they were within a package. Up until the 24th of December, I will present one function each day from helfRlein. So, on the 1st day of Christmas my true love gave to me…
If you did not already know
Clustered Monotone Transforms for Rating Factorization (CMTRF)
Exploiting low-rank structure of the user-item rating matrix has been the crux of many recommendation engines. However, existing recommendation engines force raters with heterogeneous behavior profiles to map their intrinsic rating scales to a common rating scale (e.g. 1-5). This non-linear transformation of the rating scale shatters the low-rank structure of the rating matrix, therefore resulting in a poor fit and consequently, poor recommendations. In this paper, we propose Clustered Monotone Transforms for Rating Factorization (CMTRF), a novel approach to perform regression up to unknown monotonic transforms over unknown population segments. Essentially, for recommendation systems, the technique searches for monotonic transformations of the rating scales resulting in a better fit. This is combined with an underlying matrix factorization regression model that couples the user-wise ratings to exploit shared low-dimensional structure. The rating scale transformations can be generated for each user, for a cluster of users, or for all the users at once, forming the basis of three simple and efficient algorithms proposed in this paper, all of which alternate between transformation of the rating scales and matrix factorization regression. Despite the non-convexity, CMTRF is theoretically shown to recover a unique solution under mild conditions. Experimental results on two synthetic and seven real-world datasets show that CMTRF outperforms other state-of-the-art baselines. …
Defining visualization literacy
Michael Correll on the use of “visualization literacy” in research:
WPI: Research Scientist [Worcester, MA]
At: WPI
Location: Worcester, MA
Web: www.wpi.edu
Position: Research Scientist
The Future of AI is the Enterprise
lynn.heidmann@dataiku.com (Lynn Heidmann)
In early November, Dataiku CEO Florian Douetteau was a guest on BrightTALK’s Ask the Expert series, discussing *The Future of Artificial Intelligence*, including where exactly we are now and what’s to come in the field. The video of the talk is available here, or read on: this post also features top excerpts and highlights.
Simulating dinosaur populations, with R
So it turns out that the 1990 Michael Crichton novel Jurassic Park is, indeed, a work of fiction. (Personal note: despite the snark to follow, the book is one of my all-time favorites — I clearly remember devouring it in 24 hours straight while ill in a hostel in France.) If the monsters and melodrama didn’t give it away, then this chart seals the deal:
Introducing the First AI / Machine Learning Course With a Job Guarantee
Yeshiva University: Data Science Program Director [New York, NY]
At: Yeshiva University
Location: New York, NY
Web: www.yu.edu/katz
Position: Data Science Program Director