CloudFront
The issue of latency (the length of time a network request takes to complete its round trip) is always a big deal as it relates to network traffic. If it affects minor elements, such as DNS queries, which are quite small and don’t even require a great deal of bandwidth, you can imagine how much it affects actual content: documents, images, and (heaven forbid) videos.
The solution to this problem arrived with the creation of the content delivery network (CDN), which places servers around the world and allows companies to locate their data on those servers. For example, a company located in the United States can use a CDN to place images in Australia; when an Australia-based user accesses the U.S.-based website, the pages are sent (provisionally) without images, and the images are then placed into the pages when they arrive in Australia. This approach allows important or changeable data to reside in the central location and allows static or infrequently changed large content files to be located near the user.
Overall, the use of CDNs can reduce network latency enormously. As you may expect, their use has grown significantly over time. A number of large CDN providers now provide thousands of endpoint locations around the world. These highly sophisticated solutions can be used to reduce latency for web applications, with users able to specify exactly which geographic locations should be the final destinations for distributed content.
On the other hand, one common challenge regarding CDNs is their complexity, which brings these issues to the fore:
✓ CDNs require sophisticated configuration and tuning. These “high-maintenance” needs tend to limit the use of CDNs to larger, more technically capable IT organizations that can devote a resource to learning the ins and outs of the product.
✓ CDNs can be expensive to use. Sticker shock makes them difficult to afford for small companies and even small groups or projects within larger organizations.
✓ CDNs are typically sold in an enterprise fashion. By enterprise, I mean that customers have to make a lengthy commitment to the service, estimate total usage over the length of the contract, and interact over an extended period before starting to use the service.
✓ CDNs can be overkill. If your organization wants improvement in latency but isn’t looking to implement a highly sophisticated solution, a CDN isn’t the best option.
In sum, CDNs are incredibly important and useful, but many who could potentially benefit from the technology are prevented from leveraging them because of cost and complexity issues.
Two years ago, Amazon launched its attempt to address this dispiriting state of affairs: CloudFront. CloudFront is easy to use and inexpensive, and it makes CDN technology available to entire new user bases that were previously unable to use existing CDN solutions.
CloudFront features these capabilities:
✓ It serves both static and dynamic content. Static content is served from S3, whereas dynamic content is served from EC2 instances. Static content can be downloaded or streamed to the content user.
✓ It supports three content protocols — HTTP, HTTPS, and RTMP. You’d expect HTTP and HTTPS, but RTMP — the protocol used to stream Adobe Flash–based videos — is a nice addition.
✓ Content can be made publicly available or restricted to certain users. Content control is helpful in situations where you want to make content available only to employees or company partners. In a further extension of this content-control feature, you can create an Origin Access Identity (OAI) so that objects in your S3 origin can be reached only through CloudFront, and then hand out signed URLs so that only someone holding such a special URL can access an object. A signed URL can be further restricted to specific IP addresses and to a limited time. This controlled access is typically used by organizations for commercial reasons to ensure that content access is restricted to subscribers or made available for a limited time.
✓ Content can be set with an expiration date. It’s an easy way to set things up so that, after a certain date, the content is no longer available. For short-lived content, such as certain kinds of marketing campaigns, this enables control of how long the content is available or ensures that content that is served up by CloudFront is the most recent version (or “freshest,” in CDN-speak).
✓ Content access can be logged. The ability to log content access means that the content owner can easily track how CloudFront data is being used.
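To make the restricted-access idea in the list above concrete, here is a minimal Python sketch of the kind of check it implies: a request is honored only if it comes from an allowed IP range and arrives before a cutoff time. This is an illustration of the concept only, not CloudFront’s actual signing or validation mechanism; the AccessPolicy class, the is_allowed function, and the sample addresses are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass
class AccessPolicy:
    """Hypothetical policy: who may fetch an object, and until when."""
    allowed_network: str  # CIDR range, for example "203.0.113.0/24"
    expires_at: float     # cutoff as seconds since the epoch


def is_allowed(policy: AccessPolicy, client_ip: str, now: float) -> bool:
    """Return True only if the caller is in range and the policy has not expired."""
    in_network = ip_address(client_ip) in ip_network(policy.allowed_network)
    not_expired = now < policy.expires_at
    return in_network and not_expired


# Example: access for one office network, valid until t = 1000 seconds.
policy = AccessPolicy(allowed_network="203.0.113.0/24", expires_at=1000.0)
print(is_allowed(policy, "203.0.113.7", now=500.0))
```

In a real deployment, the restriction travels inside the signed URL itself and CloudFront performs the check; the sketch shows only the two conditions being combined.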
Using CloudFront
In contrast to the more established commercial CDN alternatives, using CloudFront is straightforward. You merely use the AWS Management Console or API to define a distribution. You then associate the distribution with the origin of the content. The origin can be either S3 or EC2; I focus on S3 here. If you want, you can set additional restrictions as discussed earlier, along with an expiration period, which is 24 hours by default. You set permissions on the origin to allow public access (unless you want to restrict permissions so that only certain people can access the content). That’s it.
CloudFront returns an identifier URL for you to use to enable access to your content. The identifier takes a form similar to this:
d111111abcdef8.cloudfront.net
You use this identifier along with the name of the specific object you want served up to deliver it from CloudFront. So you may identify a JPEG image of a cat on your website as d111111abcdef8.cloudfront.net/catimage.jpg. When someone accesses your website and wants to see a picture of the kitty, the call to that URL would return the image from the nearest location to the requestor.
You can create a CNAME alias to make the CloudFront identifier appear as though it’s part of another URL. You can then map the CloudFront identifier I just mentioned to mask your use of CloudFront:
www.yourcompanydomain.com/images
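If you generate links to your content in code, combining the identifier with an object name is a one-liner. The following Python sketch shows one way to do it; the cloudfront_url helper is a hypothetical name, and the https scheme is an assumption (the identifier above is shown without one).

```python
from urllib.parse import quote


def cloudfront_url(domain: str, object_key: str, scheme: str = "https") -> str:
    """Join a distribution domain name and an object name into a full URL."""
    key = object_key.lstrip("/")         # avoid a double slash after the domain
    return f"{scheme}://{domain}/{quote(key)}"  # quote() escapes spaces and the like


print(cloudfront_url("d111111abcdef8.cloudfront.net", "catimage.jpg"))
# prints: https://d111111abcdef8.cloudfront.net/catimage.jpg
```

If you set up a CNAME alias, you would pass your own host name as the domain argument instead of the cloudfront.net identifier.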
That’s all that’s required to set up a CloudFront distribution. When someone accesses an object that’s part of a CloudFront distribution, CloudFront checks to see whether the object is located in a CloudFront cache near the requestor. If the object is in the cache, CloudFront serves it up from there. If it isn’t in the local cache, CloudFront fetches it from the Origin S3 bucket and brings it into the local cache and then serves it up to the requestor. Thereafter, CloudFront returns the object from the local cache for requests that are geographically nearby. If the expiration time on the object copy in the local cache has passed, CloudFront checks to see whether the Origin object has changed. If it has, it fetches the object into the local cache; if not, it returns the object copy that’s in the local cache.
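The lookup sequence just described can be sketched in a few lines of Python. This is a toy model of the behavior, not CloudFront’s implementation: the Origin and EdgeCache classes are hypothetical, a simple version number stands in for whatever CloudFront uses to detect a changed object, and resetting the expiration clock after an unchanged check is my assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Origin:
    """Stand-in for the origin S3 bucket: object name -> (version, body)."""
    objects: dict

    def version(self, key):
        return self.objects[key][0]

    def fetch(self, key):
        return self.objects[key]


@dataclass
class EdgeCache:
    """Stand-in for one cache location near the requestor."""
    origin: Origin
    ttl: float = 24 * 3600                       # 24-hour default expiration
    entries: dict = field(default_factory=dict)  # name -> (version, body, expires_at)
    origin_fetches: int = 0                      # counts full fetches from the origin

    def get(self, key, now):
        entry = self.entries.get(key)
        if entry is not None:
            version, body, expires_at = entry
            if now < expires_at:
                return body                      # in the cache and not expired
            if self.origin.version(key) == version:
                # Expired but unchanged at the origin: keep serving the local copy.
                self.entries[key] = (version, body, now + self.ttl)
                return body
        # Not cached, or the origin object changed: fetch it, cache it, serve it.
        version, body = self.origin.fetch(key)
        self.origin_fetches += 1
        self.entries[key] = (version, body, now + self.ttl)
        return body


origin = Origin({"catimage.jpg": (1, b"first cat")})
edge = EdgeCache(origin, ttl=10)
edge.get("catimage.jpg", now=0)   # first request: fetched from the origin
edge.get("catimage.jpg", now=5)   # second request: served from the cache
print(edge.origin_fetches)        # prints: 1
```

Note that a changed origin object is picked up only after the cached copy expires, which is why the expiration period matters for content that changes often.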
CloudFront scope
CloudFront itself is a global service: using it automatically places content around the world (except in any edge locations that you choose to exclude). The source of the CloudFront data is regionally scoped, however. For example, you may use CloudFront to distribute your video content throughout the world, making it globally available, but the bucket that contains your video is located in a particular region.
CloudFront cost
The cost of network traffic from CloudFront is only slightly higher than the cost to stream the same traffic directly from S3. For the first 10 terabytes (TB) of network traffic per month, the cost per gigabyte ranges from $.12 (North America) to $.25 (South America). This fee drops to as low as $.02 at volumes above 5 petabytes (PB).
In addition to network traffic, Amazon charges for access requests, on a per-10,000 access request basis, ranging from $.0075 (North America) to $.016 (South America). Access requests are pretty much what they sound like: requests to retrieve data managed by CloudFront.
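To see how the two charges add up, here is a rough Python estimate using the rates quoted above. It assumes the traffic rate is charged per gigabyte and that your volume stays within the first 10 TB tier; the function name and the sample volumes are hypothetical, and actual pricing changes over time.

```python
def estimate_monthly_cost(gb_transferred, requests,
                          rate_per_gb=0.12, rate_per_10k_requests=0.0075):
    """Estimate a monthly bill: traffic fee plus access-request fee.

    The defaults are the North America rates quoted in the text and apply
    only to the first 10 TB tier (assumed to be billed per gigabyte).
    """
    transfer_fee = gb_transferred * rate_per_gb
    request_fee = requests / 10_000 * rate_per_10k_requests
    return round(transfer_fee + request_fee, 2)


# 500 GB of traffic and 2 million requests in North America:
# 500 x $.12 = $60.00, plus 200 x $.0075 = $1.50
print(estimate_monthly_cost(500, 2_000_000))  # prints: 61.5
```

Passing rate_per_gb=0.25 and rate_per_10k_requests=0.016 gives the South America figure for the same volumes.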
You can reduce your network traffic costs if you restrict the number of edge locations that your content is cached in, and you can save money by committing to a certain volume of traffic each month. Though CloudFront pricing is certainly attractive, CloudFront’s true selling point is its reputation for flexibility and ease of use when compared to the established CDN providers.