SunJackson Blog

Automated Web Scraping in R

Reposted from: http://feedproxy.google.com/~r/kdnuggets-data-mining-analytics/~3/OdFfqsS4qNk/automated-web-scraping-r.html

Dan Clark

Posted on 2018-12-11

Sponsored Post. By Rebecca Merrett, Instructor at Data Science Dojo. There are many blogs and tutorials that teach you how to scrape data from a bunch of web pages once, and then you're done. But one-off web scraping is not useful for the many applications that require sentiment analysis on recent or timely content, capture changing events and commentary, or analyze trends in real time. As fun as it is to scrape the web as an academic exercise for one-off analysis of historical data, that approach does not help when you need timely or frequently updated data.


P&G: Data Scientist – Machine Learning/NLP [Cincinnati, OH]

Reposted from: http://feedproxy.google.com/~r/kdnuggets-data-mining-analytics/~3/0PdRpYUd278/12-11-pg-data-scientist-machine-learning-nlp.html

Matt Mayo Editor

Posted on 2018-12-11

At: P&G
Location: Cincinnati, OH
Web: www.pg.com
Position: Data Scientist - Machine Learning/NLP


InformationAge: Will 2019 See the Automation of Automation and Push Up Salaries of Data Scientists?

Reposted from: https://www.information-age.com/will-2019-see-the-automation-of-automation-and-push-up-salaries-of-data-scientists-123477045/

Gareth Goh

Posted on 2018-12-11


Historic Wildfire Data: Exploratory Visualization in R

Reposted from: https://www.dataquest.io/blog/r-data-viz-tutorial/

Rose Martin

Posted on 2018-12-11

In recent weeks, the devastating wildfires sweeping parts of the US state of California have featured prominently in the news.


If you did not already know

Reposted from: https://analytixon.com/2018/12/11/if-you-did-not-already-know-573/

Michael Laux

Posted on 2018-12-11

TEA-DNN: Embedded deep learning platforms have witnessed two simultaneous improvements. First, the accuracy of convolutional neural networks (CNNs) has been significantly improved through the use of automated neural-architecture search (NAS) algorithms to determine CNN structure. Second, there has been increasing interest in developing application-specific platforms for CNNs that provide improved inference performance and energy consumption as compared to GPUs. Embedded deep learning platforms differ in the amount of compute resources and memory-access bandwidth, which would affect performance and energy consumption of CNNs. It is therefore critical to consider the available hardware resources in the network architecture search. To this end, we introduce TEA-DNN, a NAS algorithm targeting multi-objective optimization of execution time, energy consumption, and classification accuracy of CNN workloads on embedded architectures. TEA-DNN leverages energy and execution time measurements on embedded hardware when exploring the Pareto-optimal curves across accuracy, execution time, and energy consumption and does not require additional effort to model the underlying hardware. We apply TEA-DNN for image classification on actual embedded platforms (NVIDIA Jetson TX2 and Intel Movidius Neural Compute Stick). We highlight the Pareto-optimal operating points that emphasize the necessity to explicitly consider hardware characteristics in the search process. To the best of our knowledge, this is the most comprehensive study of Pareto-optimal models across a range of hardware platforms using actual measurements on hardware to obtain objective values. …


CBH Group: Data Scientist [Perth, Australia]

Reposted from: http://feedproxy.google.com/~r/kdnuggets-data-mining-analytics/~3/nwEVzb_n2OE/12-11-cbh-group-data-scientist.html

Matt Mayo Editor

Posted on 2018-12-11

At: CBH Group
Location: Perth, Australia
Web: cbh.com.au
Position: Data Scientist


R Packages worth a look

Reposted from: https://analytixon.com/2018/12/11/r-packages-worth-a-look-1363/

Michael Laux

Posted on 2018-12-11

Tools for Tensor Analysis and Decomposition (rTensor): A set of tools for creation, manipulation, and modeling of tensors with arbitrary number of modes. A tensor in the context of data analysis is a multid …


Intuit: Staff Data Scientist [Woodland Hills, CA and Mountain View, CA]

Reposted from: http://feedproxy.google.com/~r/kdnuggets-data-mining-analytics/~3/ARcoAUpKL9o/12-11-intuit-data-scientist-risk.html

Matt Mayo Editor

Posted on 2018-12-11

At: Intuit
Location: Woodland Hills, CA and Mountain View, CA
Web: intuit.com
Position: Staff Data Scientist


Sharing Modeling Pipelines in R

Reposted from: http://feedproxy.google.com/~r/RBloggers/~3/4ZW3KdE95sA/

John Mount

Posted on 2018-12-11

Reusable modeling pipelines are a practical idea that gets re-developed many times in many contexts. wrapr supplies a particularly powerful pipeline notation, and a pipe-stage re-use system (notes here). We will demonstrate this with the vtreat data preparation system.


When cycling is faster than driving

Reposted from: https://flowingdata.com/2018/12/11/when-cycling-is-faster-than-driving/

Nathan Yau

Posted on 2018-12-11

Deliveroo is a service that picks up and delivers food. Data from their delivery riders showed that it was faster to ride a bike than other modes of transportation in cities. Carlton Reid for Forbes:
