AWS re:Invent has just happened, and with it came a huge number of announcements across the whole AWS portfolio. As usual there were so many that it’s a bit hard to keep track of them all, so I thought I would pull out most of the announcements related to BI and take a look at what they are and how they can help us. I might also pull out a few announcements from the weeks leading up to re:Invent so that they don’t get lost in the noise.
Another thing to note: as re:Invent only happened a few days ago, I have not had time to use any of the new services or features, so everything that follows is based on public information from Amazon.
Keynotes and talks: What’s the noise about AWS
Before we get into the fun of the announcements, let’s get a general feel for the things discussed in the re:Invent keynotes and the other discussions around re:Invent.
Data is getting bigger
Something talked about a lot at re:Invent is that the amount of data being generated is growing very quickly, so the tools that handle this data and its ever-growing size are very important. A number of the announcements at re:Invent dealt with this pace of data growth, as AWS continue to see big data and unstructured data as an important direction for the cloud.
Enterprise is over the big databases
Andy Jassy doesn’t like Oracle, which we already knew, but he seems to be extending that view to its enterprise database cousin, Microsoft’s SQL Server. Jassy spent a chunk of his keynote at AWS re:Invent talking about how some of the bigger AWS customers have been putting a lot of effort into moving off SQL Server and Oracle DB and onto alternatives without the licence fees, like open source Postgres and Amazon Aurora. We know Jassy has a lot of dislike for these products, and although there were some graphs and user stories to back up the claims, I’m still taking it all with a grain of salt.
Right, on to the announcements!
Machine Learning / AI
As it has been for the last few years, Machine Learning / AI was a large topic at re:Invent. Let’s have a look at the announcements.
SageMaker is now on the AWS Marketplace
SageMaker, AWS’s managed platform for building, training and running Machine Learning models (TensorFlow, MXNet, PyTorch), now integrates with the AWS Marketplace. This means that you can purchase (or download for free) pre-built, pre-tested, and already loved models from companies that have the time and resources to verify these models. This can cut the time it takes to get machine learning into your stack to almost nothing. On the flip side, if you have built a model that you think other people could benefit from, you can now sell it on the marketplace!
Amazon Forecast
Amazon Forecast is a pre-built machine learning model that ingests your data and creates forecasts based on the patterns it discovers in the dataset. Based on the technology that Amazon built for use on Amazon.com, Forecast can handle almost any type of dataset and help you make predictions on the future of those metrics, all without having to learn or invest in machine learning practices. Just pump in your data and get going!
Amazon SageMaker Neo
Part of the issue with building cloud-based AI models is that the data you want to run those models on is not always in the cloud. To address that issue, SageMaker Neo allows you to train Machine Learning models once, then compile them and run them anywhere, with a good boost in performance. Neo allows SageMaker models to run natively in the cloud or on connected edge devices, all with device-specific compilation boosts. This saves time and money! SageMaker Neo is available in N. Virginia, Oregon, Ohio, and Ireland to start with.
Amazon SageMaker Ground Truth
While we often talk about the difficulty of training and managing Machine Learning models, an often-skipped part of ML is building up an accurate training dataset. This is where Ground Truth comes in. Ground Truth gives you a platform to go through data and produce training datasets by annotating live datasets in an iterative and easy-to-manage process through the Ground Truth UI. Ground Truth then learns from these annotations and can often label the rest of the dataset with minimal human intervention.
Ground Truth also integrates with Amazon Mechanical Turk to give you that on-demand workforce to power through those larger datasets.
Amazon SageMaker Reinforcement Learning
Amazon SageMaker now supports Reinforcement Learning. Useful when you don’t have a lot of data to train on, Reinforcement Learning opens up a new class of training method in which the model learns by trial and error. Fun!
Amazon Timestream
Timeseries databases have proven to be quite useful in recent times, with many use cases requiring a high-performing, highly available database. While previously the only answer on AWS was to host your own InfluxDB install (or a competitor), Amazon are now introducing Amazon Timestream, a managed, scalable timeseries database. It’s in preview now, but should be ready for prime-time soon!
Amazon DynamoDB On-Demand
A great announcement for anyone who is struggling to predict the usage of their DynamoDB tables: DynamoDB On-Demand allows you to pay only for the capacity you actually use. Don’t use any capacity? Don’t pay for any. Created the next Facebook? DynamoDB will autoscale up to handle the sudden influx of interest. While slightly more expensive per request than provisioned mode, it is far more flexible, and paying only for what you use probably means you will end up saving money overall. The best bit: it’s available today, even for existing tables. Give it a go!
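As a rough sketch of what opting in looks like, the snippet below builds the arguments for the DynamoDB `create_table` and `update_table` calls with `BillingMode` set to `PAY_PER_REQUEST`. The table name, key name and helper function names are made up for illustration, and the client is assumed to be a boto3 DynamoDB client (`boto3.client("dynamodb")`) with credentials already configured.

```python
def on_demand_table_params(table_name: str, hash_key: str) -> dict:
    """Build create_table arguments for an on-demand table (hypothetical helper)."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": hash_key, "AttributeType": "S"}
        ],
        "KeySchema": [{"AttributeName": hash_key, "KeyType": "HASH"}],
        # On-demand mode: no ProvisionedThroughput block is supplied.
        "BillingMode": "PAY_PER_REQUEST",
    }


def switch_to_on_demand_params(table_name: str) -> dict:
    """Build update_table arguments that move an existing table to on-demand."""
    return {"TableName": table_name, "BillingMode": "PAY_PER_REQUEST"}


def create_on_demand_table(client, table_name: str, hash_key: str):
    """Create the table using a client assumed to be boto3.client('dynamodb')."""
    return client.create_table(**on_demand_table_params(table_name, hash_key))
```

For an existing table, passing `switch_to_on_demand_params("my_table")` to `client.update_table(**...)` should be all that is needed to change the billing mode.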
Amazon DynamoDB Transactions
DynamoDB now supports multi-item/multi-table transactions. By making it easier to maintain data integrity in a busy, concurrent world, DynamoDB Transactions reduce the amount of code you need to write, which is always a good thing. Transactions are available everywhere DynamoDB is available, now!
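To give a feel for what a transaction looks like, here is a minimal sketch of a request for the `transact_write_items` call that records an order and decrements stock in one all-or-nothing step. The table names, attribute names and the `place_order_request` helper are invented for this example; only the request shape (`TransactItems` containing `Put` and `Update` actions) comes from the DynamoDB API.

```python
def place_order_request(orders_table: str, stock_table: str,
                        order_id: str, sku: str, quantity: int) -> dict:
    """Build a transact_write_items request spanning two tables (illustrative)."""
    return {
        "TransactItems": [
            {
                # Write the order, but only if this order_id is not already used.
                "Put": {
                    "TableName": orders_table,
                    "Item": {
                        "order_id": {"S": order_id},
                        "sku": {"S": sku},
                        "quantity": {"N": str(quantity)},
                    },
                    "ConditionExpression": "attribute_not_exists(order_id)",
                }
            },
            {
                # Decrement stock, but only if enough is left.
                "Update": {
                    "TableName": stock_table,
                    "Key": {"sku": {"S": sku}},
                    "UpdateExpression": "SET in_stock = in_stock - :qty",
                    "ConditionExpression": "in_stock >= :qty",
                    "ExpressionAttributeValues": {":qty": {"N": str(quantity)}},
                }
            },
        ]
    }
```

With a boto3 DynamoDB client this would be sent as `client.transact_write_items(**place_order_request(...))`. If either condition fails, neither write is applied, which is the hand-written rollback code you no longer have to maintain.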
Amazon Quantum Ledger Database (QLDB)
Amazon Quantum Ledger Database keeps a cryptographically verifiable transaction history of all the changes made to its tables. QLDB is a fully managed ledger database with a SQL-like query interface that keeps an immutable history of everything that has happened to it, so you can be sure you have that tight audit trail. Solving the issues that are generally addressed by blockchain solutions, QLDB takes a similar but higher-performance approach in a managed, AWS-ey fashion. QLDB lets you get going without having to implement and manage a blockchain yourself. On top of that QLDB is serverless, so you only pay for what you use. In preview now, auditing data history for everyone soon.
Amazon Aurora Global Database
Amazon Aurora Global Database is a globally enabled version of the MySQL-compatible edition of Amazon’s Aurora database. Able to span multiple regions, Aurora Global Database is a great way to store data for services that span the globe. It has low-latency replication, so your data is quick to read no matter where you access it from. Available now in US East, US West and Ireland.
AWS Transfer for SFTP
Some data is stuck in systems where SFTP is the only way to get it out. This is where AWS Transfer for SFTP comes in: a managed service that creates an SFTP server that is secured by IAM and backed by S3. Simply SFTP files to it and they end up in that S3 bucket. Fun!
AWS DataSync
DataSync is a data transfer service that automates syncing file systems between servers and S3 buckets or EFS drives. Using all the things AWS knows about transferring files to the cloud, DataSync operates at up to 10x the speed of open-source tools, and is priced on the amount of data you transfer. DataSync can run once, in active sync mode, or on a schedule to keep that data storage in sync with the cloud. Now it’s much easier to move those files!
AWS S3 Glacier Deep Archive
Acting as a cheaper solution than on-premises tape, AWS S3 Glacier Deep Archive allows you to store those cold archival files at about $1/TB per month. This is currently the cheapest available cloud storage solution. You can now throw away those dying tape drives!
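To put that price in context, here is a small back-of-the-envelope sketch. It assumes the roughly $1 per TB figure is a monthly storage rate and ignores request and retrieval charges; the function names are made up for illustration. The `DEEP_ARCHIVE` value is the storage class name the S3 API uses for this tier, and the second helper only builds the arguments you would pass to an S3 `put_object` call.

```python
def estimated_storage_cost(terabytes: float, months: int,
                           usd_per_tb_month: float = 1.0) -> float:
    """Rough storage-only cost; the default rate is the approximate figure above."""
    return terabytes * months * usd_per_tb_month


def deep_archive_put_params(bucket: str, key: str, body: bytes) -> dict:
    """Build put_object arguments that write straight into Deep Archive."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "StorageClass": "DEEP_ARCHIVE",
    }


# 50 TB of backups kept for 7 years, storage only.
seven_year_cost = estimated_storage_cost(50, 7 * 12)
```

Under those assumptions, 50 TB held for seven years works out at about $4,200 in storage charges.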
Visualising data is a large part of the reason for doing a lot of that expensive data processing.
ML Insights for Amazon QuickSight
Amazon QuickSight is getting a little more intelligent with the addition of ML Insights. This allows QuickSight to discover trends and outliers, produce forecasts, and summarise data without any human interaction. Another cool feature of ML Insights is auto-narratives, which reads your data and outputs a plain-language interpretation of that dataset. ML Insights is currently in preview.
Dashboard embedding and APIs for Amazon QuickSight
QuickSight dashboards can now be embedded in applications which allows you to have interactive charting and analytics in your applications without having to investigate charting libraries.
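As a minimal sketch of the server side of embedding, the helper below asks QuickSight for a short-lived embed URL that your application can then place in an iframe. It assumes a boto3 QuickSight client (`boto3.client("quicksight")`) and IAM-based identity; the function name, account ID and dashboard ID are placeholders, and I have not run this against a real account.

```python
def fetch_dashboard_embed_url(client, account_id: str, dashboard_id: str,
                              session_minutes: int = 60) -> str:
    """Return a one-time embed URL for a dashboard (illustrative helper).

    `client` is assumed to be boto3.client("quicksight").
    """
    response = client.get_dashboard_embed_url(
        AwsAccountId=account_id,
        DashboardId=dashboard_id,
        IdentityType="IAM",
        SessionLifetimeInMinutes=session_minutes,
    )
    return response["EmbedUrl"]
```

The returned URL is meant to be generated per user session on the server and handed to the front end, rather than hard-coded into a page.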
Data Extract / Processing
Big companies have big data, and need AWS to have tools to make that big data useful.
AWS Lake Formation
Everyone should have a data lake, or at least that’s what we are being told. Certainly it’s useful to move all the siloed data from across your organisation into somewhere your analysts can have access to it. Creating a data lake, on the other hand, can be a basket of worries: you need to make sure the data is durable, highly available, and secure, so that only the people and tools who should have access do have access. This is where AWS Lake Formation comes in. With the ability to create a data lake on AWS with a few clicks, hook it up to your data sources, and use it to create security and access control settings, AWS Lake Formation takes the worry and hard work out of fishing in the lake.
Amazon Textract
Paper documents are annoying. OCR solutions are generally hard to use, and inaccurate. To solve this issue we have Amazon Textract, a context-aware, OCR-based text extraction tool that can process documents in a changing and dynamic world. You no longer need to update rules lists when documents change, and the results should be more reliable. Textract will even extract data stored in tables so you can create calculated fields with ease! In preview now, coming to the paper world soon.
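To make that a little more concrete, here is a small sketch of pulling the lines of text out of a Textract response. It assumes the response shape of the `detect_document_text` call (a list of `Blocks`, each with a `BlockType` and, for text blocks, a `Text` field); the helper names and the bucket and file names are hypothetical.

```python
def s3_document_request(bucket: str, name: str) -> dict:
    """Build detect_document_text arguments for a document stored in S3."""
    return {"Document": {"S3Object": {"Bucket": bucket, "Name": name}}}


def extract_lines(response: dict) -> list:
    """Return the text of every LINE block, in the order Textract returned them."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE" and "Text" in block
    ]


# A trimmed-down response of the assumed shape, for illustration only.
sample_response = {
    "Blocks": [
        {"BlockType": "PAGE"},
        {"BlockType": "LINE", "Text": "Invoice 0042"},
        {"BlockType": "WORD", "Text": "Invoice"},
        {"BlockType": "LINE", "Text": "Total: 19.99"},
    ]
}
lines = extract_lines(sample_response)
```

With a boto3 Textract client the real call would look like `client.detect_document_text(**s3_document_request("my-bucket", "scan.png"))`, with the result passed to `extract_lines`.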
AWS Security Hub
Security Hub is a new one-stop shop for all the security services on AWS, first or third party. It even functions across accounts, integrating all the services you need to ensure your environments are secure and compliant. Being secure has never been so easy.
Compute / Networking
Compute is always important. For BI we like a bit of ETL or data parsing in our stack; for everyone else it’s often the main thing to think about. Here are some of the compute announcements from re:Invent!
New EC2 Instance Types
Two major new classes of CPU have been added to EC2: AMD processors and AWS Graviton processors.
AMD processors are now available in select regions and offer a 10% cost saving for workloads that are compatible with AMD processors (amd64 instruction set). As they behave almost identically to the Intel processors, this gives a nice option for people looking to save a few cents in compute costs.
The AWS Graviton processor is a new ARM-based processor developed by AWS themselves. AWS say it can be up to 45% cheaper than the equivalent amd64-based processor, and it is a great way to cheaply run ARM-compatible applications.
AWS Global Accelerator
AWS Global Accelerator is a network layer service that allows you to route global traffic using the AWS global backbone. Using geo rules and health checks, and ingesting traffic close to your users, Global Accelerator enables AWS customers to create fast, low-latency, reliable applications that use Amazon’s mature and globally managed network infrastructure.
AWS Transit Gateway
Transit Gateway is a service that allows AWS users to connect lots of VPC, VPN and Direct Connect networks through a single gateway. This gives organisations a simple, single management point for all their connected AWS services, accounts and on-premises locations. All the networks need to do is connect to the Transit Gateway and the traffic will be allowed to flow to where it needs to go. With routing policies defined on the gateway itself, this reduces operational costs and management time. Cross-network communication has never been this simple.
AWS App Mesh
AWS App Mesh is a service mesh for microservices running on AWS. App Mesh takes over network traffic control and routing, which simplifies the process of getting microservices up and running and of monitoring them while they run. It can route traffic around failures and integrates with EC2, ECS, and EKS.
AWS Cloud Map
AWS Cloud Map is a managed service discovery service for your AWS cloud. Define custom names for your cloud resources and AWS Cloud Map keeps the location of these services updated, and helps route your application’s traffic to those services. With built-in health checking, AWS Cloud Map allows you to keep the world turning with minimal management. Available now in all commercial AWS regions, so give it a go!
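As a rough illustration of the lookup side, the sketch below turns the result of a Cloud Map `discover_instances` call into a list of healthy `ip:port` endpoints. It assumes the instances were registered with the standard `AWS_INSTANCE_IPV4` and `AWS_INSTANCE_PORT` attributes; the helper name, namespace and service names are invented for the example.

```python
def healthy_endpoints(response: dict) -> list:
    """Pick out 'ip:port' strings for instances that Cloud Map reports as healthy."""
    endpoints = []
    for instance in response.get("Instances", []):
        if instance.get("HealthStatus") != "HEALTHY":
            continue
        attributes = instance.get("Attributes", {})
        ip = attributes.get("AWS_INSTANCE_IPV4")
        port = attributes.get("AWS_INSTANCE_PORT")
        if ip and port:
            endpoints.append(f"{ip}:{port}")
    return endpoints
```

With a boto3 client this would be fed by something like `boto3.client("servicediscovery").discover_instances(NamespaceName="example.local", ServiceName="orders")`, where both names are placeholders.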
AWS Marketplace supports containers
You can now purchase and download pre-defined container services from the AWS Marketplace, just as you could grab EC2-based systems.
Hybrid Cloud / Cloud Management
Hybrid cloud technologies are a great way to manage those old systems while allowing the innovation of cloud-based systems.
AWS Outposts
Bringing a little AWS to your own data center, AWS Outposts enables on-premises hardware to be managed using the same APIs and systems that we have become accustomed to from AWS. Order the AWS Outposts hardware, hook it up, and now your data center appears as a new AWS region in the console. Amazing! AWS Outposts will be available to everyone in the second half of 2019.
AWS Control Tower
Control Tower sets up and manages a multi-account AWS environment with all the controls and best practices one expects when starting out on AWS.
Whew, that’s a lot of things, and that’s only some of what was talked about at re:Invent. For the rest, take a look at this list.
I hope to have some blogs on these services once we get started using them.
Coffee to Code