Cloudera Data Science Workbench (CDSW) provides data science teams with a self-service platform for quickly developing machine learning workloads in their preferred language, with secure access to enterprise data and simple provisioning of compute. Individuals can request schedulable resources (e.g. compute, memory, GPUs) on a shared cluster that is managed centrally.
While self-service provisioning of resources is critical to the rapid interaction cycle of data scientists, it can pose a challenge to administrators. Administrators are responsible for ensuring sufficient resources are available to support user demand with good performance, while simultaneously keeping costs low. The key to this balance is understanding how the environment is being used.
In CDSW 1.2, we’ve improved the ability of administrators to monitor user activity and resource usage.
Administrators can now:
-
Identify basic usage patterns and debug failuresBasic user statistics enable administrators to understand when and how the system is being used. They can report product adoption status to stakeholders, address system abuse, and quickly determine why jobs are failing.
-
**Capacity plan **CDSW is currently deployed on CDH gateway nodes, which must be provisioned for expected capacity. Administrators can now easily monitor how much of system capacity (CPU/GPU/memory/storage) is used over time and plan accordingly.
-
Rationalize licensesAdministrators can now tell who is actually using CDSW to determine if they should adjust the number of licensed users.
-
Implement chargeback pricingCDSW is often deployed on top of a shared CDH cluster, on behalf of multiple line-of-business teams who work with common and secured data. Administrators can now see total usage statistics by users and teams to better implement chargeback accounting.
These tasks can be accomplished by using the updated Users and Activity tabs within the CDSW Administration panel.
User Activity Monitoring
The users summary table now displays more metrics on a per user basis. All columns are sortable, which means administrators can easily discover top users by various metrics and verify engagement. Administrators can validate product adoption, implement chargeback models for teams and identify top consumers.
For deeper analysis, administrators can download the usage data to a CSV file for analysis in a spreadsheet or CDSW itself.
Resource Usage Monitoring
Navigating to the Site Administration / Activity panel now exposes several time series charts to help track resources over time and link these resources to particular sessions and jobs. For a chosen date range, admins have a window into the CPU, Memory, and GPU usage tendencies of their installation. To support more debugging scenarios, they can also graph the number of running sessions and scheduling and startup lag of sessions and jobs. This allows admins to observe overall resource usage patterns, capacity plan based on historical data from their own cluster, and debug cluster-wide issues that may be lowering performance.
Again, as there may be even more insight administrators wish to extract on their own, full user activity and resource usage data can be exported to CSV with the click of a button.
We hope these new capabilities help Cloudera administrators gain better visibility into their Cloudera Data Science Workbench deployments. Want to learn more? Check out the Cloudera Data Science Workbench product overview or explore the latest documentation.