- What is the Azure Data Science Virtual Machine for Linux and Windows?
- Comparison with Azure Machine Learning
- Comparison with AzureML Compute Instances
- Sample Use Cases
- Short-term experimentation and evaluation
- Deep learning with GPUs
- Data science training and education
- What’s included on the DSVM?
- Windows virtual machines in Azure
- What do I need to think about before creating a VM?
- Locations
- Availability
- VM size
- VM Limits
- Operating system disks and images
- Extensions
- Related resources
- Data residency
- What is a virtual machine (VM)?
- Virtual machines: virtual computers within computers
- Explore virtual machines and the cloud with an Azure free account
- How does a virtual machine work?
- What are VMs used for?
- What are the benefits of using VMs?
- Get started with virtual machines
What is the Azure Data Science Virtual Machine for Linux and Windows?
The Data Science Virtual Machine (DSVM) is a customized VM image on the Azure cloud platform built specifically for doing data science. It has many popular data science tools preinstalled and pre-configured to jump-start building intelligent applications for advanced analytics.
The DSVM is available on:
- Windows Server 2019
- Ubuntu 18.04 LTS
Comparison with Azure Machine Learning
The DSVM is a customized VM image for data science, whereas Azure Machine Learning (AzureML) is an end-to-end platform that encompasses:
- Fully Managed Compute
  - Compute Instances
  - Compute Clusters for distributed ML tasks
  - Inference Clusters for real-time scoring
- Datastores (for example Blob, ADLS Gen2, SQL DB)
- Experiment tracking
- Model management
- Notebooks
- Environments (manage conda and R dependencies)
- Labeling
- Pipelines (automate end-to-end data science workflows)
Comparison with AzureML Compute Instances
Azure Machine Learning Compute Instances are a fully configured and managed VM image whereas the DSVM is an unmanaged VM.
The key differences between these two product offerings are detailed below:
Feature | Data Science VM | AzureML Compute Instance |
---|---|---|
Fully Managed | No | Yes |
Language Support | Python, R, Julia, SQL, C#, Java, Node.js, F# | Python and R |
Operating System | Ubuntu, Windows | Ubuntu |
Pre-Configured GPU Option | Yes | Yes |
Scale up option | Yes | Yes |
SSH Access | Yes | Yes |
RDP Access | Yes | No |
Built-in Hosted Notebooks | No (requires additional configuration) | Yes |
Built-in SSO | No (requires additional configuration) | Yes |
Built-in Collaboration | No | Yes |
Pre-installed Tools | Jupyter(lab), RStudio Server, VSCode, Visual Studio, PyCharm, Juno, Power BI Desktop, SSMS, Microsoft Office 365, Apache Drill | Jupyter(lab), RStudio Server |
Sample Use Cases
Below we illustrate some common use cases for DSVM customers.
Short-term experimentation and evaluation
You can use the DSVM to evaluate or learn new data science tools, especially by going through some of our published samples and walkthroughs.
Deep learning with GPUs
In the DSVM, your training models can use deep learning algorithms on hardware that’s based on graphics processing units (GPUs). By taking advantage of the VM scaling capabilities of the Azure platform, the DSVM helps you use GPU-based hardware in the cloud according to your needs. When you’re training large models or need high-speed computations, you can switch to a GPU-based VM while keeping the same OS disk. You can choose any of the N-series GPU-enabled virtual machine SKUs with the DSVM. Note: GPU-enabled virtual machine SKUs are not supported on Azure free accounts.
The Windows editions of the DSVM come pre-installed with GPU drivers, frameworks, and GPU versions of deep learning frameworks. On the Linux editions, deep learning on GPUs is enabled on the Ubuntu DSVMs.
You can also deploy the Ubuntu or Windows editions of the DSVM to an Azure virtual machine that isn’t based on GPUs. In this case, all the deep learning frameworks will fall back to the CPU mode.
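The CPU fallback described above can be sketched in a few lines of Python. This is only an illustration, not DSVM code: it assumes PyTorch is the framework in use (the text does not name one), and `pick_device` is a hypothetical helper name.

```python
def pick_device() -> str:
    """Return "cuda" when a usable GPU is visible, otherwise "cpu"."""
    try:
        import torch  # assumption: PyTorch is the deep learning framework in use
    except ImportError:
        # The framework is not installed, so report the CPU as the only option.
        return "cpu"
    # torch.cuda.is_available() is False on VM sizes without a GPU.
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Training would run on: {pick_device()}")
```

Code written this way runs unchanged on both GPU-based and CPU-only VM sizes, which is convenient when you resize the VM between experiments.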
Data science training and education
Enterprise trainers and educators who teach data science classes usually provide a virtual machine image. The image ensures students have a consistent setup and that the samples work predictably.
The DSVM creates an on-demand environment with a consistent setup that eases the support and incompatibility challenges. Cases where these environments need to be built frequently, especially for shorter training classes, benefit substantially.
What’s included on the DSVM?
See a full list of tools on both the Windows and Linux DSVMs here.
Windows virtual machines in Azure
Azure Virtual Machines (VM) is one of several types of on-demand, scalable computing resources that Azure offers. Typically, you choose a VM when you need more control over the computing environment than the other choices offer. This article gives you information about what you should consider before you create a VM, how you create it, and how you manage it.
An Azure VM gives you the flexibility of virtualization without having to buy and maintain the physical hardware that runs it. However, you still need to maintain the VM by performing tasks, such as configuring, patching, and installing the software that runs on it.
Azure virtual machines can be used in various ways. Some examples are:
- Development and test – Azure VMs offer a quick and easy way to create a computer with specific configurations required to code and test an application.
- Applications in the cloud – Because demand for your application can fluctuate, it might make economic sense to run it on a VM in Azure. You pay for extra VMs when you need them and shut them down when you don’t.
- Extended datacenter – Virtual machines in an Azure virtual network can easily be connected to your organization’s network.
The number of VMs that your application uses can scale up and out to whatever is required to meet your needs.
What do I need to think about before creating a VM?
There are always a multitude of design considerations when you build out an application infrastructure in Azure. These aspects of a VM are important to think about before you start:
- The names of your application resources
- The location where the resources are stored
- The size of the VM
- The maximum number of VMs that can be created
- The operating system that the VM runs
- The configuration of the VM after it starts
- The related resources that the VM needs
Locations
All resources created in Azure are distributed across multiple geographical regions around the world. Usually, the region is called location when you create a VM. For a VM, the location specifies where the virtual hard disks are stored.
This table shows some of the ways you can get a list of available locations.
Method | Description |
---|---|
Azure portal | Select a location from the list when you create a VM. |
Azure PowerShell | Use the Get-AzLocation command. |
REST API | Use the List locations operation. |
Azure CLI | Use the az account list-locations command. |
Availability
Azure announced an industry-leading single-instance virtual machine Service Level Agreement (SLA) of 99.9%, provided you deploy the VM with premium storage for all disks. For your deployment to qualify for the standard 99.95% VM SLA, you still need to deploy two or more VMs running your workload inside an availability set. An availability set ensures that your VMs are distributed across multiple fault domains in the Azure datacenters and deployed onto hosts with different maintenance windows. The full Azure SLA explains the guaranteed availability of Azure as a whole.
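To make the two percentages concrete, the following Python sketch converts an availability figure into a downtime budget. The 30-day month and the helper name `allowed_downtime_minutes` are assumptions made for illustration; the actual SLA terms define how availability is measured.

```python
MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60  # 43,200 minutes (assumed billing month)


def allowed_downtime_minutes(sla_percent: float,
                             period_minutes: int = MINUTES_PER_30_DAY_MONTH) -> float:
    """Return the downtime, in minutes, that an availability percentage permits."""
    if not 0 <= sla_percent <= 100:
        raise ValueError("sla_percent must be between 0 and 100")
    # The unavailable fraction of the period, rounded to hide float noise.
    return round((100 - sla_percent) / 100 * period_minutes, 2)


print(allowed_downtime_minutes(99.9))   # 43.2  (single instance, premium storage)
print(allowed_downtime_minutes(99.95))  # 21.6  (two or more VMs in an availability set)
```

Under these assumptions, moving from 99.9% to 99.95% halves the permitted downtime per month.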
VM size
The size of the VM that you use is determined by the workload that you want to run. The size that you choose then determines factors such as processing power, memory, and storage capacity. Azure offers a wide variety of sizes to support many types of uses.
Azure charges an hourly price based on the VM’s size and operating system. For partial hours, Azure charges only for the minutes used. Storage is priced and charged separately.
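The per-minute proration can be illustrated with a short Python sketch. The hourly rate below is a made-up number, not an Azure price, and `estimate_compute_cost` is a hypothetical helper; storage is charged separately and is not included.

```python
def estimate_compute_cost(hourly_rate: float, minutes_used: int) -> float:
    """Estimate the compute charge when partial hours are billed by the minute."""
    if hourly_rate < 0 or minutes_used < 0:
        raise ValueError("hourly_rate and minutes_used must be non-negative")
    # Prorate the hourly price over the minutes the VM actually ran.
    return round(hourly_rate * minutes_used / 60, 4)


# Hypothetical rate of 0.50 per hour for a VM that ran for 90 minutes.
print(estimate_compute_cost(0.50, 90))  # 0.75
```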
VM Limits
Your subscription has default quota limits in place that could impact the deployment of many VMs for your project. The current limit is 20 VMs per region per subscription. Limits can be raised by filing a support ticket requesting an increase.
Operating system disks and images
Virtual machines use virtual hard disks (VHDs) to store their operating system (OS) and data. VHDs are also used for the images you can choose from to install an OS.
Azure provides many marketplace images to use with various versions and types of Windows Server operating systems. Marketplace images are identified by image publisher, offer, sku, and version (typically version is specified as latest). Only 64-bit operating systems are supported. For more information on the supported guest operating systems, roles, and features, see Microsoft server software support for Microsoft Azure virtual machines.
This table shows some ways that you can find the information for an image.
Method | Description |
---|---|
Azure portal | The values are automatically specified for you when you select an image to use. |
Azure PowerShell | Get-AzVMImagePublisher -Location location; Get-AzVMImageOffer -Location location -Publisher publisherName; Get-AzVMImageSku -Location location -Publisher publisherName -Offer offerName |
REST APIs | List image publishers; List image offers; List image skus |
Azure CLI | az vm image list-publishers --location location; az vm image list-offers --location location --publisher publisherName; az vm image list-skus --location location --publisher publisherName --offer offerName |
You can choose to upload and use your own image and when you do, the publisher name, offer, and sku aren’t used.
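The four fields that identify a marketplace image are commonly written as a single colon-separated string (publisher:offer:sku:version), which is the form the Azure CLI accepts when you name an image. The Python sketch below models that; the `ImageReference` class is hypothetical and the example values are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageReference:
    """The four fields that identify a marketplace image."""
    publisher: str
    offer: str
    sku: str
    version: str = "latest"  # the version is typically specified as latest

    def to_urn(self) -> str:
        """Join the fields into publisher:offer:sku:version."""
        return ":".join((self.publisher, self.offer, self.sku, self.version))

    @classmethod
    def from_urn(cls, urn: str) -> "ImageReference":
        """Split a publisher:offer:sku:version string back into its fields."""
        parts = urn.split(":")
        if len(parts) != 4 or not all(parts):
            raise ValueError(f"expected publisher:offer:sku:version, got {urn!r}")
        return cls(*parts)


# Illustrative values for a Windows Server image.
image = ImageReference("MicrosoftWindowsServer", "WindowsServer", "2019-Datacenter")
print(image.to_urn())  # MicrosoftWindowsServer:WindowsServer:2019-Datacenter:latest
```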
Extensions
VM extensions give your VM additional capabilities through post deployment configuration and automated tasks.
These common tasks can be accomplished using extensions:
- Run custom scripts – The Custom Script Extension helps you configure workloads on the VM by running your script when the VM is provisioned.
- Deploy and manage configurations – The PowerShell Desired State Configuration (DSC) Extension helps you set up DSC on a VM to manage configurations and environments.
- Collect diagnostics data – The Azure Diagnostics Extension helps you configure the VM to collect diagnostics data that can be used to monitor the health of your application.
Related resources
The resources in this table are used by the VM and need to exist or be created when the VM is created.
Resource | Required | Description |
---|---|---|
Resource group | Yes | The VM must be contained in a resource group. |
Storage account | Yes | The VM needs the storage account to store its virtual hard disks. |
Virtual network | Yes | The VM must be a member of a virtual network. |
Public IP address | No | The VM can have a public IP address assigned to it to remotely access it. |
Network interface | Yes | The VM needs the network interface to communicate in the network. |
Data disks | No | The VM can include data disks to expand storage capabilities. |
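The table above can be turned into a simple pre-deployment check. The Python sketch below mirrors the Required column exactly as listed; the resource keys and the `missing_required` helper are hypothetical names chosen for illustration, not part of any Azure SDK.

```python
from typing import Iterable, List

# Mirrors the Required column of the table above (True = required).
RELATED_RESOURCES = {
    "resource_group": True,
    "storage_account": True,
    "virtual_network": True,
    "public_ip_address": False,
    "network_interface": True,
    "data_disks": False,
}


def missing_required(planned: Iterable[str]) -> List[str]:
    """Return the required resources that a deployment plan leaves out."""
    planned_set = set(planned)
    return sorted(
        name
        for name, required in RELATED_RESOURCES.items()
        if required and name not in planned_set
    )


plan = ["resource_group", "virtual_network", "public_ip_address"]
print(missing_required(plan))  # ['network_interface', 'storage_account']
```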
Data residency
In Azure, the feature to enable storing customer data in a single region is currently only available in the Southeast Asia Region (Singapore) of the Asia Pacific Geo and Brazil South (Sao Paulo State) Region of Brazil Geo. For all other regions, customer data is stored in Geo. For more information, see Trust Center.
What is a virtual machine (VM)?
An intro to virtualization and the benefits of VMs
Virtual machines: virtual computers within computers
A virtual machine, commonly shortened to just VM, is no different from any physical computer, such as a laptop, smartphone, or server. It has a CPU, memory, disks to store your files, and can connect to the internet if needed. While the parts that make up your computer (called hardware) are physical and tangible, VMs are often thought of as virtual computers or software-defined computers within physical servers, existing only as code.
Explore virtual machines and the cloud with an Azure free account
Create, deploy, and monitor VMs using 12 months of free services
How does a virtual machine work?
Virtualization is the process of creating a software-based, or “virtual”, version of a computer, with dedicated amounts of CPU, memory, and storage that are “borrowed” from a physical host computer (such as your personal computer) and/or a remote server (such as a server in a cloud provider’s datacenter). A virtual machine is a computer file, typically called an image, that behaves like an actual computer. It can run in a window as a separate computing environment, often to run a different operating system, or it can even function as the user’s entire computer experience, as is common on many people’s work computers. The virtual machine is partitioned from the rest of the system, meaning that the software inside a VM can’t interfere with the host computer’s primary operating system.
What are VMs used for?
Here are a few ways virtual machines are used:
- Building and deploying apps to the cloud.
- Trying out a new operating system (OS), including beta releases.
- Spinning up a new environment to make it simpler and quicker for developers to run dev-test scenarios.
- Backing up your existing OS.
- Accessing virus-infected data or running an old application by installing an older OS.
- Running software or apps on operating systems that they weren’t originally intended for.
What are the benefits of using VMs?
While virtual machines run like individual computers with individual operating systems and applications, they have the advantage of remaining completely independent of one another and the physical host machine. A piece of software called a hypervisor, or virtual machine manager, lets you run different operating systems on different virtual machines at the same time. This makes it possible to run Linux VMs, for example, on a Windows OS, or to run an earlier version of Windows on a more current Windows OS.
And, because VMs are independent of each other, they’re also extremely portable. You can move a VM on a hypervisor to another hypervisor on a completely different machine almost instantaneously.
Because of their flexibility and portability, virtual machines provide many benefits, such as:
- Cost savings – Running multiple virtual environments from one piece of infrastructure means that you can drastically reduce your physical infrastructure footprint. This boosts your bottom line by reducing the number of servers you need to maintain and saving on maintenance costs and electricity.
- Agility and speed – Spinning up a VM is relatively easy and quick and is much simpler than provisioning an entire new environment for your developers. Virtualization makes the process of running dev-test scenarios a lot quicker.
- Lowered downtime – VMs are portable and easy to move from one hypervisor to another on a different machine, which makes them a great solution for backup in the event the host goes down unexpectedly.
- Scalability – VMs allow you to more easily scale your apps by adding more physical or virtual servers to distribute the workload across multiple VMs. As a result, you can increase the availability and performance of your apps.
- Security benefits – Because a VM runs its own guest operating system, separate from the host, you can run apps of questionable security inside it while protecting your host operating system. VMs also allow for better security forensics, and are often used to safely study computer viruses, isolating the viruses to avoid putting the host computer at risk.
Get started with virtual machines
Discover Azure cloud compute and learn how to create and deploy VMs from an Azure technical expert.