SunJackson Blog

When the bubble bursts…

Reposted from: http://hunch.net/?p=9604328

jl

Posted on 2018-06-04

Consider the following facts:

Read more »

Trustworthy Data Analysis

Reposted from: https://simplystatistics.org/2018/06/04/trustworthy-data-analysis/

Unknown

Posted on 2018-06-04

Roger Peng, 2018/06/04

Read more »

rqdatatable: rquery Powered by data.table

Reposted from: http://www.win-vector.com/blog/2018/06/rqdatatable-rquery-powered-by-data-table/

John Mount

Posted on 2018-06-03

rquery is an R package for specifying data transforms using piped Codd-style operators. It has already shown great performance on PostgreSQL and Apache Spark. rqdatatable is a new package that supplies a screaming fast implementation of the rquery system in-memory using the data.table package.

Read more »

Data Links

Reposted from: https://rinzewind.org/blog-en/2018/data-links-156.html

José María Mateos

Posted on 2018-06-03

The sentiment has visibly declined: far fewer tweets praise deep learning as the ultimate algorithm, and the papers are becoming less “revolutionary” and much more “evolutionary”. DeepMind hasn’t shown anything breathtaking since AlphaGo Zero [and even that wasn’t that exciting, given the obscene amount of compute necessary and applicability to games only - see Moravec’s paradox]. OpenAI has been rather quiet, with their last media outburst being the Dota 2 playing agent [which I suppose was meant to create as much buzz as AlphaGo, but fizzled out rather quickly]. In fact, articles began showing up suggesting that even Google does not know what to do with DeepMind, as their results are apparently not as practical as originally expected… As for the prominent researchers, they’ve generally been touring around, meeting with government officials in Canada or France to secure their future grants; Yann LeCun even stepped down (rather symbolically) from Head of Research to Chief AI Scientist at Facebook. This gradual shift from rich, big corporations to government-sponsored institutes suggests to me that the interest in this kind of research within these corporations (I think of Google and Facebook) is actually slowly winding down. Again, these are all early signs, nothing spoken out loud, just the body language.

Read more »

Lucy's Secret Number puzzle

Reposted from: http://datagenetics.com/blog/june12018/index.html

Unknown

Posted on 2018-06-03

Since there are four questions, and each answer can be yes or no, there are sixteen possible combinations of answers (think of these as the bits of a four-bit binary number).
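The counting argument can be checked with a minimal Python sketch. The function names here are illustrative only and do not come from the original puzzle post; it simply enumerates every yes/no pattern and reads each one as a four-bit number.

```python
from itertools import product


def answer_combinations(num_questions=4):
    """Return every possible tuple of yes/no answers as booleans."""
    return list(product([False, True], repeat=num_questions))


def as_number(answers):
    """Read a tuple of answers as binary digits, first answer most significant."""
    value = 0
    for answer in answers:
        value = value * 2 + int(answer)
    return value


combos = answer_combinations()
codes = [as_number(c) for c in combos]
print(len(combos))  # 2 ** 4 combinations
```

Each combination maps to a distinct value from 0 to 15, which is why four yes/no answers can distinguish at most sixteen cases.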

Read more »

CraftyDataViz Winners

Reposted from: https://www.becomingadatascientist.com/2018/06/01/craftydataviz-winners/

Renee

Posted on 2018-06-02

About a month ago, on a whim, I posted the #CraftyDataViz contest, hoping for some beautiful and wacky homemade visualizations, and you all sure came through! The entries were gorgeous and the judging was super difficult!

Read more »

Bulk Loading Shapefiles Into Postgres/Postgis

Reposted from: http://randyzwitch.com/bulk-loading-postgis/

Unknown

Posted on 2018-06-01

Recently I’ve been doing a fair bit of work with geospatial data, mostly on the data preparation side. While there are common data formats, I have found that because so much of this data is sourced from government agencies, it is often split across many files that can be concatenated.

Read more »

Python and Tidyverse

Reposted from: https://itsalocke.com/blog/python-and-tidyverse/

Unknown

Posted on 2018-06-01

Introduction

Read more »

Parallel, Disk-Efficient .zip to .gz Conversion

Reposted from: http://randyzwitch.com/zip-to-gzip-conversion-parallel/

Unknown

Posted on 2018-06-01

Similar to my last post about needing to merge shapefiles using PostGIS, I recently downloaded a bunch of energy data from the federal government: 13,370 files, to be exact. While the data size itself isn’t that large (~8GB, compressed), an open-source tool I was looking to evaluate supports only gzip compression, not the zip-compressed files I actually had.
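One way to approach this kind of conversion is sketched below in Python. This is not the method from the original post, which is not shown in this excerpt; the function names, the directory layout, and the choice of a thread pool are all assumptions made for illustration. Each archive member is streamed straight from the zip into a gzip file, so nothing is written to disk uncompressed.

```python
import gzip
import shutil
import zipfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def zip_to_gz(zip_path, out_dir):
    """Stream every file inside one .zip archive into its own .gz file."""
    zip_path, out_dir = Path(zip_path), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Assumption: member names are unique across archives;
            # identically named members would overwrite each other.
            target = out_dir / (Path(info.filename).name + ".gz")
            with archive.open(info) as src, gzip.open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)  # copies in chunks
            written.append(target)
    return written


def convert_all(zip_dir, out_dir, workers=4):
    """Convert every .zip in zip_dir, handling several archives concurrently."""
    zips = sorted(Path(zip_dir).glob("*.zip"))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        batches = pool.map(lambda z: zip_to_gz(z, out_dir), zips)
        return [path for batch in batches for path in batch]
```

A call such as `convert_all("downloads", "converted", workers=8)` (hypothetical directory names) would process the archives in sorted order and return the paths of the `.gz` files it wrote. Threads are used here for simplicity; a process pool or a shell tool may be a better fit for heavily CPU-bound compression.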

Read more »

A crystal clear book draw

Reposted from: https://itsalocke.com/blog/a-crystal-clear-book-draw/

Unknown

Posted on 2018-06-01

As you might know, every month, a random Locke Data Twitter follower wins an excellent data science book! This month’s gift was “An Introduction to Statistical Learning: with Applications in R”, a classic and useful textbook. In this post I’ll give you some magick-al tips from behind-the-scenes of this month’s winner announcement. It’ll feature learning from my mistakes, and reading from a crystal ball… or more seriously, image manipulation in R!

Read more »