<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>DevOps.com</title>
	<atom:link href="http://devops.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://devops.com</link>
	<description>Finishing what agile development started</description>
	<lastBuildDate>Fri, 07 Jun 2013 02:45:55 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='devops.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://0.gravatar.com/blavatar/e5f778b3364da9b9ff6f10d82e0a36cb?s=96&#038;d=http%3A%2F%2Fs2.wp.com%2Fi%2Fbuttonw-com.png</url>
		<title>DevOps.com</title>
		<link>http://devops.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://devops.com/osd.xml" title="DevOps.com" />
	<atom:link rel='hub' href='http://devops.com/?pushpress=hub'/>
		<item>
		<title>Fresh Stats Comparing Traditional IT and DevOps Oriented Productivity</title>
		<link>http://devops.com/2013/06/04/fresh-stats-comparing-traditional-it-and-devops-oriented-productivity/</link>
		<comments>http://devops.com/2013/06/04/fresh-stats-comparing-traditional-it-and-devops-oriented-productivity/#comments</comments>
		<pubDate>Tue, 04 Jun 2013 16:00:20 +0000</pubDate>
		<dc:creator>Martin J. Logan</dc:creator>
				<category><![CDATA[CaseStudy]]></category>
		<category><![CDATA[devops]]></category>
		<category><![CDATA[devops productivity]]></category>
		<category><![CDATA[devops statistics]]></category>
		<category><![CDATA[devops stats]]></category>

		<guid isPermaLink="false">http://devops.com/?p=519</guid>
		<description><![CDATA[This is a guest post by Krishnan Badrinarayanan (@bkrishz), ZeroTurnaround The word “DevOps” has been thrown around quite a lot lately. Job boards are awash with requisitions for “DevOps Engineers” with varying descriptions. What is DevOps, really? In order to better understand what the fuss is all about, we surveyed 620 engineers to examine what [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=devops.com&#038;blog=20154961&#038;post=519&#038;subd=devopsdotcom&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><em>This is a <a title="Guest Posting Guidelines " href="http://devops.com/guest-posting-guidelines/">guest post</a> by Krishnan Badrinarayanan (<a href="http://twitter.com/bkrishz">@bkrishz</a>), ZeroTurnaround</em></p>
<p>The word “DevOps” has been thrown around quite a lot lately. Job boards are awash with requisitions for “DevOps Engineers” with varying descriptions. What is DevOps, really?</p>
<p>To better understand what the fuss is all about, we surveyed 620 engineers to examine what they do to keep everything running like clockwork – from day-to-day activities to key processes, tools and the challenges they face. The survey asked for feedback on how much time is spent improving infrastructure and setting up automation for repetitive tasks; how much time is typically spent fighting fires and communicating; and what it takes to keep the lights on. We then compared the responses from traditional IT teams with those from DevOps teams. Here are the results, in time spent each week carrying out key activities:</p>
<p><a href="http://devopsdotcom.files.wordpress.com/2013/06/image-for-devops-com-byline.png"><img src="http://devopsdotcom.files.wordpress.com/2013/06/image-for-devops-com-byline.png?w=895&#038;h=702" alt="devops productivity stats" width="895" height="702" class="alignnone size-full wp-image-523" /></a></p>
<h2>Conclusions we can draw from the results</h2>
<p><b>DevOps oriented teams spend slightly more time automating tasks</b></p>
<p>Writing scripts and automating processes have been part of the Ops playbook for decades now. Shell scripts, Python and Perl are often used to automate repetitive configuration tasks, but with newer tools like Chef and Puppet, Ops folk perform more sophisticated kinds of automation, such as <a href="http://zeroturnaround.com/labs/pragmatic-devops-virtualization-provisioning-with-vagrant-chef/">spinning up virtual machines and tailoring them to the app’s needs</a> using Chef recipes or Puppet manifests.</p>
<p><b>Both Traditional IT and DevOps oriented teams communicate actively</b></p>
<p>Respondents belonging to a DevOps oriented team spend 2 fewer hours communicating each week, possibly because DevOps fosters better collaboration and keeps Dev and Ops teams in sync with each other. However, Dev and Ops folk in Traditional IT teams spend over 7 hours each week communicating. This active dialogue helps them better understand challenges, set expectations and triage issues. How much of this communication can be deemed inefficient is subjective, but it is necessary to get both teams on board. Today, shared tooling, instant messaging, task managers and social tools also help bring everyone closer together in real time.</p>
<p><b>DevOps oriented teams fight fires less frequently</b></p>
<p>A key tenet of the DevOps methodology is to embrace the possibility of failures and be prepared for them. With alerts, continuous testing, monitoring and feedback loops that expose vulnerabilities and key metrics, teams can act quickly and proactively. Programmable infrastructure and automated deployments provide quick recovery while minimizing user impact.</p>
<p><b>DevOps oriented teams spend less time on administrative support</b></p>
<p>This could be a result of better communication, a higher level of automation and the availability of self-service tools and scripts for most support tasks. If there’s a high level of provisioning and automation, there’s no reason why admin support shouldn’t dwindle to a very small time drain. It could also mean that members of DevOps oriented teams help themselves more often rather than expecting to be supported by the system administrator.</p>
<p><b>DevOps oriented teams work fewer days after-hours</b></p>
<p>We asked our survey takers how many days per week they work outside of normal business hours. Here’s what we learned:</p>
<table border="0" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td valign="top"><b>Days worked after hours</b></td>
<td valign="top"><b>Traditional IT</b></td>
<td valign="top"><b>DevOps Oriented</b></td>
</tr>
<tr>
<td valign="top">Average</td>
<td valign="top">2.3</td>
<td valign="top">1.5</td>
</tr>
<tr>
<td valign="top">Standard Deviation</td>
<td valign="top">1.7</td>
<td valign="top">1.7</td>
</tr>
</tbody>
</table>
<p>According to these results, DevOps team members lead a more balanced life, spend more time on automation and infrastructure improvement, spend less time fighting fires, and work fewer hours (especially outside of normal business hours).</p>
<p>DevOps-related initiatives came up on top in 2012 and 2013, according to our survey. There’s a strong need for agility to respond to ever-changing and expanding market needs. Software teams are under pressure to help meet those needs, and the chart above validates the benefits of a DevOps approach.</p>
<p><b>Rosy Stats, but hard to adopt</b></p>
<p><b>How we got here</b></p>
<p>IT Organizational structures &#8211; typically Dev, QA, and Ops &#8211; have come to exist for a reason. The dev team focuses on innovating and creating apps. The QA team ensures that the app behaves as intended. The operations team keeps the infrastructure running &#8211; from the apps, network, servers, shared resources to third party services. Each team requires a special set of skills in order to deliver a superior experience in a timely manner.</p>
<p><b>The challenge</b></p>
<p>Today’s users increasingly rely on software and expect it to meet their constantly evolving needs 24/7, whether they’re at their desks or on their mobile devices. As a result, IT teams need to respond to change and release app updates quickly and efficiently without compromising on quality. Fail to do so, and they risk driving users to competitors or other alternatives.</p>
<p>However, releasing apps quickly comes with its own drawbacks. It strains functionally siloed teams and often results in software defects, delays and stress. Infrequent communication across teams further exacerbates the issue, leading to a snowball effect of finger-pointing and bad vibes.</p>
<p><b>Spurring cultural change</b></p>
<p>Both Dev and Ops teams bring a unique set of skills and experience to software development and delivery. DevOps is simply a culture that brings development and operations teams together so that, through understanding each other’s perspectives and concerns, they can build and deliver resilient, production-ready software products in a timely manner. DevOps is not NoOps. Nor is it akin to putting a Dev in Ops clothing. DevOps is synergistic, rather than cannibalistic.</p>
<p><b>DevOps is a journey</b></p>
<p>Instilling a DevOps oriented culture within your organization is not something that you embark on and chalk up as a success at the end. Adopting DevOps takes discipline and initiative to bring development and operations teams together. Read up on how other organizations approach adopting DevOps as a culture and learn from their successes and failures. Put into practice what makes sense within your group. Develop a maturity model that can guide you through your journey.</p>
<p>The goal is to make sure that dev and ops are on the same page, working together on everything, toward a common goal: continuous delivery of working software without handoffs, hand-washing, or finger-pointing.</p>
<p><b>Support the community and the cause</b></p>
<p>Dev and Ops need to look introspectively to understand their strengths and challenges, and look for ways to contribute towards breaking down silos. Together, they should seek to educate each other, culturally evolve roles, relationships, incentives, and processes and put end user experience first.</p>
<p>The DevOps community is small but burgeoning, and it’s easy to find ways to get involved, like with the community-driven explosion of <a href="http://devopsdays.org/">DevOpsDays</a> conferences that occur around the world.</p>
<p><b>Set small goals to be awesome</b></p>
<p>Teams should collaborate to set achievable goals and milestones that can get them on the path to embracing a DevOps culture. Celebrate small successes and focus on continuous improvement. Before you know it, you will slowly but surely reap the benefits of bringing a DevOps approach to application development and delivery.</p>
<p><b>Start here</b></p>
<p>For deeper insights into IT Ops and DevOps Productivity with a focus on people, methodologies and tools, <a href="http://zeroturnaround.com/rebellabs/ops/it-ops-devops-productivity-report-2013/">download a 35-page report</a> filled with stats and charts.</p>
]]></content:encoded>
			<wfw:commentRss>http://devops.com/2013/06/04/fresh-stats-comparing-traditional-it-and-devops-oriented-productivity/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ca9a5341a9cf42e4a190816db94a88ca?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">scrumtrilescent</media:title>
		</media:content>

		<media:content url="http://devopsdotcom.files.wordpress.com/2013/06/image-for-devops-com-byline.png" medium="image">
			<media:title type="html">devops productivity stats</media:title>
		</media:content>
	</item>
		<item>
		<title>The State of DevOps: Accelerating Adoption</title>
		<link>http://devops.com/2013/04/29/the-state-of-devops-accelerating-adoption/</link>
		<comments>http://devops.com/2013/04/29/the-state-of-devops-accelerating-adoption/#comments</comments>
		<pubDate>Mon, 29 Apr 2013 16:00:35 +0000</pubDate>
		<dc:creator>Martin J. Logan</dc:creator>
				<category><![CDATA[CaseStudy]]></category>
		<category><![CDATA[HighLevel]]></category>

		<guid isPermaLink="false">http://devops.com/?p=499</guid>
		<description><![CDATA[By James Turnbull (@kartar), VP of Technology Operations, Puppet Labs Inc. A sysadmin’s time is too valuable to waste resolving conflicts between operations and development teams, working through problems that stronger collaboration would solve, or performing routine tasks that can &#8211; and should &#8211; be automated. Working more collaboratively and freed from repetitive tasks, IT [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=devops.com&#038;blog=20154961&#038;post=499&#038;subd=devopsdotcom&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><em>By James Turnbull (<a href="https://twitter.com/kartar" rel="nofollow">@kartar</a>), VP of Technology Operations, Puppet Labs Inc.</em> </p>
<p>A sysadmin’s time is too valuable to waste resolving conflicts between operations and development teams, working through problems that stronger collaboration would solve, or performing routine tasks that can &#8211; and should &#8211; be automated. Working more collaboratively and freed from repetitive tasks, IT can &#8211; and will &#8211; play a strategic role in any business.</p>
<p>At Puppet Labs, we believe DevOps is the right approach for solving some of the cultural and operational challenges many IT organizations face. But without empirical data, a lot of the evidence for DevOps success has been anecdotal. </p>
<p>To find out whether DevOps-attuned organizations really do get better results, Puppet Labs partnered with <a href="http://itrevolution.com/" rel="nofollow">IT Revolution Press</a> to survey a broad spectrum of IT operations people, software developers and QA engineers.</p>
<p>The data gathered in the <a href="https://puppetlabs.com/solutions/devops/">2013 State of DevOps Report</a>  proves that DevOps concepts can make companies of any size more agile and more effective. We also found that the longer a team has embraced DevOps, the better the results. That success &#8211; along with growing awareness of DevOps &#8211; is driving faster adoption of DevOps concepts.</p>
<p><strong>DevOps is everywhere</strong></p>
<p>Our survey tapped just over 4,000 people living in approximately 90 countries. They work for a wide variety of organizations: startups, small to medium-sized companies, and huge corporations.</p>
<p>Most of our survey respondents &#8211; about 80 percent &#8211; are hands-on: sysadmins, developers or engineers. Break this down further, and we see more than 70 percent of these hands-on folks are actually in IT Ops, with the other 30 percent in development and engineering.</p>
<p><strong>DevOps orgs ship faster, with fewer failures</strong></p>
<p>DevOps ideas enable IT and engineering to move much faster than teams working in more traditional ways. Survey results showed:</p>
<ul>
<li><strong>More frequent and faster deployments.</strong> High-performing organizations deploy code 30 times faster than their peers. Rather than deploying every week or month, these organizations deploy multiple times per day. Change lead time is much shorter, too. Rather than requiring lead time of weeks or months, teams that embrace DevOps can go from change order to deploy in just a few minutes. That means deployments can be completed up to 8,000 times faster.</li>
<li><strong>Far fewer outages.</strong> Change failure drops by 50 percent, and service is restored 12 times faster.</li>
</ul>
<p>Organizations that have been working with DevOps the longest report the most frequent deployments, with the highest success rates. To cite just a few high-profile examples, Google, Amazon, Twitter and Etsy are all known for deploying frequently, without disrupting service to their customers.</p>
<p><strong>Version control + automated code deployment = higher productivity, lower costs &amp; quicker wins</strong></p>
<p>Survey respondents who reported the highest levels of performance rely on version control and automation:</p>
<ul>
<li>89 percent use version control systems for infrastructure management</li>
<li>82 percent automate their code deployments</li>
</ul>
<p>Version control allows you to quickly pinpoint the cause of failures and resolve issues fast. Automating your code deployment eliminates configuration drift as you change environments. You save time and reduce errors by replacing manual workflows with a consistent and repeatable process. Management can rely on that consistency, and you free your technical teams to work on the innovations that give your company its competitive edge.</p>
<p><strong>What are DevOps skills?</strong></p>
<p>More recruiters are including the term DevOps in job descriptions. We found a 75 percent uptick in the 12-month period from January 2012 to January 2013. Mentions of  DevOps as a job skill increased 50 percent during the same period.</p>
<p>In order of importance, here are the skills associated with DevOps:</p>
<ul>
<li><strong>Coding &amp; scripting.</strong> Demonstrates the increasing importance of traditional developer skills to IT operations.</li>
<li><strong>People skills.</strong> Acknowledges the importance of communication and collaboration  in DevOps environments.</li>
<li><strong>Process re-engineering skills.</strong> Reflects the holistic view of IT and development as a single system, rather than as two different functions. </li>
</ul>
<p>Interestingly, experience with specific tools was the lowest priority when seeking people for DevOps teams. This makes sense to us: It’s easier for people to learn new tools than to acquire the other skills.</p>
<p>It makes sense on a business level, too. After all, the tools a business needs will change as technology, markets and the business itself shift and evolve. What doesn’t change, however, is the need  for agility, collaboration and creativity in the face of new business challenges.</p>
<p>About the author:</p>
<p><a href="http://devopsdotcom.files.wordpress.com/2013/04/james-turnbull-portrait.jpeg"><img src="http://devopsdotcom.files.wordpress.com/2013/04/james-turnbull-portrait.jpeg?w=100&#038;h=100" alt="James Turnbull portrait" width="100" height="100" class="alignnone size-full wp-image-500" /></a>A former IT executive in the banking industry and author of five technology books, James has been involved in IT Operations for 20 years and is an advocate of open source technology. He joined Puppet Labs in March 2010.</p>
]]></content:encoded>
			<wfw:commentRss>http://devops.com/2013/04/29/the-state-of-devops-accelerating-adoption/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ca9a5341a9cf42e4a190816db94a88ca?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">scrumtrilescent</media:title>
		</media:content>

		<media:content url="http://devopsdotcom.files.wordpress.com/2013/04/james-turnbull-portrait.jpeg" medium="image">
			<media:title type="html">James Turnbull portrait</media:title>
		</media:content>
	</item>
		<item>
		<title>Data-Driven Observations on AWS Usage from CloudCheckr’s User Survey</title>
		<link>http://devops.com/2013/04/02/amazon-web-services-usage-survey-results-for-march-2013-by-cloudcheckr/</link>
		<comments>http://devops.com/2013/04/02/amazon-web-services-usage-survey-results-for-march-2013-by-cloudcheckr/#comments</comments>
		<pubDate>Tue, 02 Apr 2013 11:58:07 +0000</pubDate>
		<dc:creator>Martin J. Logan</dc:creator>
				<category><![CDATA[CaseStudy]]></category>

		<guid isPermaLink="false">http://devops.com/?p=466</guid>
		<description><![CDATA[This is a guest post by Aaron Klein from CloudCheckr We were heartened when AWS made Trusted Advisor free for the month of March. This was an implicit acknowledgement of what many have long known: AWS is complex and can be challenging for users to provision and control their AWS infrastructure effectively. We took the [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=devops.com&#038;blog=20154961&#038;post=466&#038;subd=devopsdotcom&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><em>This is a <a title="Guest Posting Guidelines " href="http://devops.com/guest-posting-guidelines/">guest post</a> by Aaron Klein from CloudCheckr</em></p>
<p>We were heartened when AWS made Trusted Advisor free for the month of March. This was an implicit acknowledgement of what many have long known: AWS is complex, and users can find it challenging to provision and control their AWS infrastructure effectively.</p>
<p>We took the AWS announcement as an opportunity to conduct an internal survey of our customers’ usage. We compared the initial assessments of 400 of our users’ accounts against our 125+ best practice checks for proper configurations and policies. Our best practice checks span 3 key categories: Cost, Availability, and Security.  We limited our survey to users with 10 or more running EC2 instances.  In aggregate, the users were running more than 16,000 EC2 instances.</p>
<p>We were surprised to discover that nearly every customer (99%) experienced at least one serious exception.  Beyond this top level takeaway, our primary conclusion was that controlling cost may grab the headlines, but users also need to button up a number of availability and security issues.</p>
<p>When considering availability, there were serious configuration issues that were common across a high percentage of users. Users repeatedly failed to optimally configure Auto Scaling and ELB. The failure to create sufficient EBS snapshots was an almost universal issue.</p>
<p>Although users passed more of our security checks, the exceptions which did arise were serious. Many of the most common security issues were found in configurations for S3, where nearly 1 in 5 users allowed unfettered access to their buckets through “Upload/Delete” or “Edit Permissions” set to everyone. As we explained in an earlier whitepaper, anyone using a simple bucket finder tool could locate and access these buckets.</p>
<p>Beyond the numbers, we also interviewed customers to gather qualitative feedback from users on some of the more interesting data points.</p>
<p>If the findings of this survey spark questions about how well your AWS account is configured, CloudCheckr offers a free account that you can set up in minutes. Simply enter read-only credentials from your AWS account and CloudCheckr will assess your configurations and policies in just a few minutes: <a href="https://app.cloudcheckr.com/LogOn/Registration" rel="nofollow">https://app.cloudcheckr.com/LogOn/Registration</a></p>
<p align="center"><b>Conclusions by Area</b></p>
<p><b>Conclusions based upon Cost Exceptions:</b></p>
<p>As noted, our sample comprised 16,047 instances. The sample group spent a total of $2,254,987 per month on EC2 (and its associated costs), for an average monthly cost per customer of $7,516. We also noted a mismatch between quantity and cost: spot instances represent 8% of the quantity but only 1.4% of the cost. This is because spot instances are significantly less expensive than on-demand instances.</p>
<p>When we looked at the Cost Exceptions, <b>we found that 96% of all users experienced at least 1 exception</b> (with many experiencing multiple exceptions). In total, we found that users who adopted our recommended instance sizing and purchasing type were able to save an average of $3,974 per month, for an aggregate total of $1,192,212 per month.</p>
<p>This suggested that <b>price optimization remains a large hurdle for AWS users who rely on native AWS tools</b>. Users consistently fail to optimize purchasing and also fail to optimize utilization. These combined issues meant that the average customer pays nearly twice as much as necessary for resources to achieve proper performance for their technology.</p>
<p>To further examine this behavior, we interviewed a number of customers.  We interviewed customers who exclusively purchased on-demand and customers who used multiple purchasing types.</p>
<p>Here were their answers (summarized and consolidated):</p>
<ul>
<li>Spot instances worry users – there is a general concern of: “what if the price spikes and my instance is terminated?” This fear exists despite the fact that spikes occur very rarely, warnings are available, and proper configuration can significantly mitigate this “surprise termination” risk.</li>
<li>It is difficult and time-consuming to map the cost scenarios for purchasing reserved instances. The customers who did make this transition had cobbled together homegrown spreadsheets as a way of supporting this business decision. The ones who didn’t make this effort made a gut estimate that it wasn’t worth the time: AWS was cost-effective enough, and the time and effort for modeling the transition was an opportunity cost taken away from building and managing their technology.</li>
<li>The intricacies of matching the configurations between on demand instances and reserved instances while taking into consideration auto scaling and other necessary configurations were daunting. Many felt it was not worth the effort.</li>
<li>Amazon&#8217;s own practice of regularly lowering prices is a deterrent to purchasing RIs. This is especially true for RIs with a 3 year commitment. In fact, among the customers who did purchase RIs, none expressed a desire to purchase RIs with a 3 year commitment. All supported their decision by referencing the regular AWS price drops combined with the fact that they could not accurately predict their business requirements 3 years out.</li>
</ul>
<p><b>Conclusions based upon Availability Exceptions:</b></p>
<p>We compared our users against our Availability best practices and found that <b>nearly 98% suffered from at least 1 exception</b>. We hypothesized that this was due to the overall complexity of AWS and interviewed some of our users for confirmation. Here is what we found from those interviews:</p>
<ul>
<li>Users were generally surprised with the exceptions. They believed that they “had done everything right” but then realized that they underestimated the complexity of AWS.</li>
<li>Users were often unsure of exactly why something needed to be remedied. The underlying architecture of AWS continues to evolve and users have a difficult time keeping up to speed with new services and enhancements.</li>
<li>AWS dynamism played a large role in the number of exceptions. Users commented that they often fixed exceptions and, after a week of usage, found new exceptions had arisen.</li>
<li>Users remained very happy with the overall level of service from AWS. Despite the exceptions which could diminish overall availability, the users still found that AWS offered tremendous functionality advantages.</li>
</ul>
<p><b>Conclusions based upon Security Exceptions:</b></p>
<p>Finally, we looked at security. Here we found that <b>44% of our users had at least one serious exception present</b> during the initial scan. The most serious and common exceptions occurred within S3 usage and bucket permissioning. Given the differences in cloud vs. data center architecture, this was not entirely surprising. We interviewed our users about this area and here is what we found:</p>
<ul>
<li>The AWS Management Console offered little functionality for helping with S3 security. It does not provide a user-friendly means of monitoring and controlling S3 inventory and usage. In fact, we found that most of our users were surprised when the inventory was reported. They often had 300-500% more buckets, objects and storage than they expected.</li>
<li>Price = Importance: S3 is often an afterthought for users. Because it is so inexpensive, users do not audit it as closely as EC2 and other more expensive services, and rarely create and implement formal policies for S3 usage. The time and effort required to log into each region one by one to collect S3 information and download data through the Management Console was not worth it relative to the spend.</li>
<li>Given the low cost and lack of formal policies, team members throw up high volumes of objects and buckets knowing that they can store huge amounts of data at a minimal cost.  Since users did not audit what they had stored, they could not determine the level of security.</li>
</ul>
<p><a href="http://devopsdotcom.files.wordpress.com/2013/03/cloudforestfinal.png"><img class="alignnone size-full wp-image-474" alt="Cloud Computing Forrest" src="http://devopsdotcom.files.wordpress.com/2013/03/cloudforestfinal.png?w=800&#038;h=2600" width="800" height="2600" /></a></p>
<p><a href="http://devopsdotcom.files.wordpress.com/2013/03/aaronklein.jpg"><img class="alignnone size-full wp-image-477" alt="AaronKlein" src="http://devopsdotcom.files.wordpress.com/2013/03/aaronklein.jpg?w=200&#038;h=200" width="200" height="200" /></a><strong>Author Info: </strong>Aaron is the Co-Founder/COO of CloudCheckr Inc. (CCI). With over 20 years of managerial experience and vision, he directs the company&#8217;s operations.</p>
<p>Aaron has held key leadership roles at diverse organizations ranging from small entrepreneurial start-ups to multi-billion dollar enterprises. Aaron graduated from Brandeis University and holds a J.D. from SUNY Buffalo.</p>
<p align="center"><b>Underlying Data Summary</b></p>
<p><b>Cost: Any exception 96%</b></p>
<p>The total of 16,047 instances was broken down into the following categories:</p>
<ul>
<li>On Demand: 78% (12,517 instances)</li>
<li>Reserved: 14% (2,247 instances)</li>
<li>Spot: 8% (1,284 instances)</li>
</ul>
<p>Spending by purchasing type was broken down as follows:</p>
<ul>
<li>On Demand: 89.7% ($2,023,623)</li>
<li>Reserved: 8.9% ($199,803)</li>
<li>Spot: 1.4% ($31,561)</li>
</ul>
<p>Common Cost Exceptions we found:</p>
<ul>
<li>Idle EC2 Instances: 36%</li>
<li>Underutilized EC2 Instances: 84%</li>
<li>EC2 Reserved Instance Possible Matching Mistake: 17%</li>
<li>Unused Elastic IP: 59%</li>
</ul>
<p><b> </b></p>
<p><b>Availability:                                                                                              Any exception 98%</b></p>
<p>Here, broken out by service, are some highlights of common and serious exceptions that we found:</p>
<p><b>Service Type:                                                                                      Customers with Exceptions</b></p>
<p><b>EC2:                                                                                                           Any exception   95%</b></p>
<ul>
<li>EBS Volumes That Need Snapshots                                                                91%</li>
<li>Overutilized EC2 Instances                                                                                                   22%</li>
</ul>
<p><b>Auto Scaling:                                                                                              </b><b>Any exception</b><b>   66%</b></p>
<ul>
<li>Auto Scaling Groups Not Being Utilized  For All EC2 Instances                       57%</li>
<li>All Auto Scaling Groups Not Utilizing Multiple Availability Zones                34%</li>
<li>Auto Scaling Launch Configuration Referencing Invalid Security Group                   22%</li>
<li>Auto Scaling Launch Configuration Referencing Invalid AMI                           18%</li>
<li>Auto Scaling Launch Configuration Referencing Invalid Key Pair                 16%</li>
</ul>
<p><b>ELB:                                                                                                        </b><b>Any exception</b><b>   42%</b></p>
<ul>
<li>Elastic Load Balancers Not Utilizing Multiple Availability Zones                 37%</li>
<li>Elastic Load Balancers With Fewer Than Two Healthy Instances               21%</li>
</ul>
<p><b>Security:</b><b>                                                                                                    Any exception</b><b>   46%</b></p>
<p>These were the most common exceptions that we found:</p>
<ul>
<li>EC2 Security Groups Allowing Access To Broad IP Ranges                                 36%</li>
<li>S3 Bucket(s) With &#8216;Upload/Delete&#8217; Permission Set To Everyone                   16%</li>
<li>S3 Bucket(s) With &#8216;View Permissions&#8217; Permission Set To Everyone            24%</li>
<li>S3 Bucket(s) With &#8216;Edit Permissions&#8217; Permission Set To Everyone               14%</li>
</ul>
<br />  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/devopsdotcom.wordpress.com/466/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/devopsdotcom.wordpress.com/466/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=devops.com&#038;blog=20154961&#038;post=466&#038;subd=devopsdotcom&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://devops.com/2013/04/02/amazon-web-services-usage-survey-results-for-march-2013-by-cloudcheckr/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ca9a5341a9cf42e4a190816db94a88ca?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">scrumtrilescent</media:title>
		</media:content>

		<media:content url="http://devopsdotcom.files.wordpress.com/2013/03/cloudforestfinal.png" medium="image">
			<media:title type="html">Cloud Computing Forest</media:title>
		</media:content>

		<media:content url="http://devopsdotcom.files.wordpress.com/2013/03/aaronklein.jpg" medium="image">
			<media:title type="html">AaronKlein</media:title>
		</media:content>
	</item>
		<item>
		<title>DevOps &#8211; A Valentine&#8217;s Day Fairy Tale</title>
		<link>http://devops.com/2013/02/14/devops-valentines-day-fairy-tale/</link>
		<comments>http://devops.com/2013/02/14/devops-valentines-day-fairy-tale/#comments</comments>
		<pubDate>Thu, 14 Feb 2013 12:05:01 +0000</pubDate>
		<dc:creator>Martin J. Logan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://devops.com/?p=458</guid>
		<description><![CDATA[This is a guest post by Matt Watson from Stackify Once upon a time two people from different sides of the tracks met and fell in love. Never before had the two people found another person who so perfectly complemented them. Society tried to keep them apart &#8211; “It’s just not how things are done,” [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=devops.com&#038;blog=20154961&#038;post=458&#038;subd=devopsdotcom&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<div id="attachment_460" class="wp-caption alignnone" style="width: 1034px"><a href="http://devopsdotcom.files.wordpress.com/2013/02/untitled.png"><img src="http://devopsdotcom.files.wordpress.com/2013/02/untitled.png?w=1024&#038;h=535" alt="DevOps - A Valentine&#039;s Day Fairy Tale" width="1024" height="535" class="size-large wp-image-460" /></a><p class="wp-caption-text">DevOps &#8211; A Valentine&#8217;s Day Fairy Tale</p></div>
<p><em>This is a <a href="http://devops.com/guest-posting-guidelines/" title="Guest Posting Guidelines ">guest post</a> by Matt Watson from <a href="http://www.stackify.com/">Stackify</a></em></p>
<p>Once upon a time two people from different sides of the tracks met and fell in love. Never before had the two people found another person who so perfectly complemented them. Society tried to keep them apart &#8211; “It’s just not how things are done,” they’d say. But times were changing, and this sort of pairing was becoming more socially acceptable.</p>
<p>They met at the perfect time. </p>
<p>Ops had grown tired of the day-to-day grind of solving other people’s problems. Enough was enough, and she needed a change in her life. A perfectionist and taskmaster to the highest degree, she tended to be very controlling and possessive in relationships. It became more about commands than conversation, making life miserable for both parties. She began to realize she hated change, and felt like she spent most of her time saying “No.” It was time to open up and begin to share to make a relationship work.</p>
<p>Dev, on the other hand, was beginning to mature (a little late in the game, as guys seem to) and trying to find some direction. He had grown tired of communication breakdowns in relationships &#8211; angry phone calls in the middle of the night, playing the blame game, and his inability to meet halfway on anything. He began to realize most of those angry phone calls came as a result of making impulsive decisions without considering how they would impact others. His bad decisions commonly led to performance problems and created a mess for his partners. Dev wanted to more actively seek out everything that makes a healthy relationship work.</p>
<p>The timing was right for a match made in heaven. Dev and Ops began openly working and living side by side, making sure both contributed equally to the relationship. Ops realized she didn’t have to be so controlling if she and Dev could build trust with each other. Dev realized that he caused fewer fights if he involved Ops in decisions about the future, since those decisions impacted both of them. It was a growing process that caused a lot of rapid and sudden change. Still, as in most relationships, they knew it was important not to move too fast, no matter how good it felt.</p>
<p>Dev and Ops dated for about four years before they decided to get married. Now they will be living together and sharing so much more; will their relationship last? How will it need to change to support the additional closeness? But they aren’t worried; they know it is true love and will do whatever it takes to make it work. Relationships are always hard, and they know they can solve most of their problems with a reboot, hotfix, or patch cable.</p>
<p>Will you accept their forbidden love?</p>
<h3>7 Reasons the DevOps Relationship is Built to Last</h3>
<ol>
<li>Faster development and deployment cycles (but don’t move too fast!)</li>
<li>Stronger and more flexible automation with deployment task repeatability</li>
<li>Lowers the risk and stress of a product deployment by making development more iterative, so small changes are made all the time instead of large changes every so often</li>
<li>Improves interaction and communication between the two parties to keep both sides in the loop and active</li>
<li>Aids in standardizing all development environments</li>
<li>Dramatically <a href="http://www.stackify.com/application-support-perfect-devops/" title="Devops simplifies application support">simplifies application support</a> because everyone has a better view of the big picture</li>
<li>Improves application testing and troubleshooting</li>
</ol>
<p><a href="http://devopsdotcom.files.wordpress.com/2013/02/mat.png"><img src="http://devopsdotcom.files.wordpress.com/2013/02/mat.png?w=150&#038;h=100" alt="mat" width="150" height="100" class="alignleft size-thumbnail wp-image-442" /></a></p>
<p><em>About the author: Matt Watson is the Founder &amp; CEO of <a href="http://www.stackify.com">Stackify</a>. He has a lot of experience managing high growth and complex technology projects. He is focused on changing the way developers support their production applications with DevOps.</em></p>
<br />  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/devopsdotcom.wordpress.com/458/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/devopsdotcom.wordpress.com/458/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=devops.com&#038;blog=20154961&#038;post=458&#038;subd=devopsdotcom&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://devops.com/2013/02/14/devops-valentines-day-fairy-tale/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ca9a5341a9cf42e4a190816db94a88ca?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">scrumtrilescent</media:title>
		</media:content>

		<media:content url="http://devopsdotcom.files.wordpress.com/2013/02/untitled.png?w=1024" medium="image">
			<media:title type="html">DevOps - A Valentine&#039;s Day Fairy Tale</media:title>
		</media:content>

		<media:content url="http://devopsdotcom.files.wordpress.com/2013/02/mat.png?w=150" medium="image">
			<media:title type="html">mat</media:title>
		</media:content>
	</item>
		<item>
		<title>Defining the Dev and the Ops in Devops</title>
		<link>http://devops.com/2013/02/11/defining-the-dev-and-the-ops-roles-in-devops/</link>
		<comments>http://devops.com/2013/02/11/defining-the-dev-and-the-ops-roles-in-devops/#comments</comments>
		<pubDate>Mon, 11 Feb 2013 20:09:39 +0000</pubDate>
		<dc:creator>Martin J. Logan</dc:creator>
				<category><![CDATA[HighLevel]]></category>

		<guid isPermaLink="false">http://devops.com/?p=426</guid>
		<description><![CDATA[This is a guest post by Matt Watson from Stackify So what does DevOps mean exactly? What is the Dev and what is the Ops in DevOps? The role of Operations can mean a lot of things and even different things to different people. DevOps is becoming more and more popular but a lot of [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=devops.com&#038;blog=20154961&#038;post=426&#038;subd=devopsdotcom&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><a href="http://devopsdotcom.files.wordpress.com/2013/02/devops-define-roles.jpg"><img src="http://devopsdotcom.files.wordpress.com/2013/02/devops-define-roles.jpg?w=400&#038;h=299" alt="development and operations roles not well defined" width="400" height="299" class="alignright size-full wp-image-446" /></a></p>
<p><em>This is a <a href="http://devops.com/guest-posting-guidelines/" title="Guest Posting Guidelines ">guest post</a> by Matt Watson from <a href="http://www.stackify.com/">Stackify</a></em></p>
<p>So what does DevOps mean exactly? What is the Dev and what is the Ops in DevOps? The role of Operations can mean a lot of things, and even different things to different people. DevOps is becoming more and more popular, but a lot of people are confused about who does what. So let’s make a list of the responsibilities operations traditionally has, then figure out what developers should be doing and which responsibilities, if any, should be shared.</p>
<h3>Operations responsibilities</h3>
<ul>
<li>IT buying</li>
<li>Installation of server hardware and OS</li>
<li>Configuration of servers, networks, storage, etc…</li>
<li>Monitoring of servers</li>
<li>Responding to outages</li>
<li>IT security</li>
<li>Managing phone systems and networks</li>
<li>Change control</li>
<li>Backup and disaster recovery planning</li>
<li>Managing Active Directory</li>
<li>Asset tracking</li>
</ul>
<h3>Shared Development &amp; Operations duties</h3>
<ul>
<li>	Software deployments</li>
<li>	Application support</li>
</ul>
<p>Some of these traditional responsibilities have changed in the last few years. Virtualization and the cloud have greatly simplified buying decisions, installation, and configuration. For example, nobody cares what kind of server we are going to buy anymore for a specific application or project. We buy great big ones, virtualize them, and just carve out what we need and change it on the fly. Cloud hosting simplifies this even more by eliminating the need to buy servers at all.</p>
<p>So what part of the “Ops” duties should developers be responsible for?</p>
<ul>
<li>	Be involved in selecting the application stack</li>
<li>	Configure and deploy virtual or cloud servers (potentially)</li>
<li>	Deploy their applications</li>
<li>	Monitor application and system health</li>
<li>	Respond to application problems as they arise</li>
</ul>
<p>Developers who take ownership of these responsibilities can ultimately deploy and support their applications more rapidly. DevOps processes and tools eliminate the walls between the teams and enable more agility for the business. This philosophy can potentially make developers responsible for the entire application stack, from the OS level up, in a more self-service mode.</p>
<p>So what does the operations team do then?</p>
<ul>
<li>	Manage the hardware infrastructure</li>
<li>	Configure and monitor networking</li>
<li>	Enforce policies around backup, DR, security, compliance, change control, etc.</li>
<li>	Assist in monitoring the systems</li>
<li>	Manage Active Directory</li>
<li>	Asset tracking</li>
<li>	Other tasks not related to production applications</li>
</ul>
<p>Depending on the company size, the workload of these tasks will vary greatly. In large enterprises these operations tasks become complex enough to require specialization and dedicated personnel. For small to midsize companies, the IT manager and one or two system administrators can typically handle them.</p>
<p>DevOps is evolving into letting the operations team focus on the infrastructure and IT policies while empowering the developers to exercise tremendous ownership from the OS level up. With a solid infrastructure, developers can own the application stack, build it, deploy it, and cover much if not all of its support. This enables development teams to be more self-service and independent of a busy centralized operations team. DevOps enables more agility, better efficiency, and ultimately a higher level of service for customers.</p>
<p><a href="http://devopsdotcom.files.wordpress.com/2013/02/mat.png"><img src="http://devopsdotcom.files.wordpress.com/2013/02/mat.png?w=150&#038;h=100" alt="mat" width="150" height="100" class="alignleft size-thumbnail wp-image-442" /></a></p>
<p><em>About the author: Matt Watson is the Founder &amp; CEO of Stackify. He has a lot of experience managing high growth and complex technology projects. He is focused on changing the way developers support their production applications with DevOps.</em></p>
<br />  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/devopsdotcom.wordpress.com/426/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/devopsdotcom.wordpress.com/426/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=devops.com&#038;blog=20154961&#038;post=426&#038;subd=devopsdotcom&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://devops.com/2013/02/11/defining-the-dev-and-the-ops-roles-in-devops/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ca9a5341a9cf42e4a190816db94a88ca?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">scrumtrilescent</media:title>
		</media:content>

		<media:content url="http://devopsdotcom.files.wordpress.com/2013/02/devops-define-roles.jpg" medium="image">
			<media:title type="html">development and operations roles not well defined</media:title>
		</media:content>

		<media:content url="http://devopsdotcom.files.wordpress.com/2013/02/mat.png?w=150" medium="image">
			<media:title type="html">mat</media:title>
		</media:content>
	</item>
		<item>
		<title>Approaches to Application Release Automation</title>
		<link>http://devops.com/2012/12/10/approaches-to-application-release-automation/</link>
		<comments>http://devops.com/2012/12/10/approaches-to-application-release-automation/#comments</comments>
		<pubDate>Mon, 10 Dec 2012 07:01:21 +0000</pubDate>
		<dc:creator>Martin J. Logan</dc:creator>
				<category><![CDATA[HighLevel]]></category>
		<category><![CDATA[ARA]]></category>
		<category><![CDATA[deployment automation]]></category>
		<category><![CDATA[nolio]]></category>
		<category><![CDATA[release automation]]></category>

		<guid isPermaLink="false">http://devops.com/?p=409</guid>
		<description><![CDATA[This is a guest post by Phil Cherry from Nolio A discussion of process-based, package-based, declarative, imperative and generic approaches to application release automation. Application Release Automation is a relatively new, but rapidly maturing area of IT. As with all new areas there is plenty of confusion around what Application Release Automation really is and [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=devops.com&#038;blog=20154961&#038;post=409&#038;subd=devopsdotcom&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><a href="http://devops.com/2012/12/10/approaches-to-application-release-automation/gears/" rel="attachment wp-att-410"><img src="http://devopsdotcom.files.wordpress.com/2012/12/gears.png?w=207&#038;h=199" alt="gears" width="207" height="199" class="alignleft size-full wp-image-410" /></a></p>
<p><em>This is a <a href="http://devops.com/guest-posting-guidelines/" title="Guest posting guidelines ">guest post</a> by Phil Cherry from <a href="http://www.noliosoft.com/">Nolio</a></em></p>
<p>A discussion of process-based, package-based, declarative, imperative and generic approaches to application release automation.</p>
<p>Application Release Automation is a relatively new, but rapidly maturing area of IT.  As with all new areas there is plenty of confusion around what Application Release Automation really is and the best way to go about it.  There are those who come at it with a very developer-centric mind-set, there are those who embrace the modern DevOps concept and even those who attempt to apply server based automation tools to the application space.</p>
<p>Having worked with many companies of various sizes, technologies, cultures and mind-sets, both as they select an ARA (Application Release Automation) tool and as they move on to implement their chosen tool, I have had many opportunities to assess the various approaches.  In this short blog post I will discuss the pros and cons of each approach.</p>
<h3>Package-Based</h3>
<p>Package-based automation is a technique that was originally designed for automating the server layer. Due to its success there, some have attempted to adapt it to automate the application layer as well.  Packages encapsulate all the changes that need to be performed on a single server, and can include the pre-requisite checks that need to take place, as well as the post-deployment verifications.  When patching a server this makes complete sense: there are no dependencies between the patched server and the others in the same data centre, so all the required changes for that patch (or patches) can be applied in one bundle.  The package can then be applied to all appropriate servers without modification.  At this layer there is little difference between one Windows Server 2008 machine and the next, even though the applications on top may be completely different.<br />
The benefit of this packaging approach is easy rollback: if required, the package can be rolled back to restore the original server state.  On the other hand, it treats each server as an island with no dependency on any other server, and it assumes that all changes on that server can be done in one go.  This type of automation is offered by products such as BMC BladeLogic and IBM Tivoli Provisioning Manager (TPM).</p>
<h3>Declarative-Based</h3>
<p>Declarative-based automation comes from a similar mindset to package-based but takes a different route to the solution.  It also originally came out of the need to automate the server layer, followed by an attempt to apply it to the application layer. With declarative-based automation, the desired state of the server is defined down to every individual configuration item (registry key, DLL, config file entry, etc.).  Most declarative-based tools require you to describe the desired state by writing what is effectively a piece of ‘code’.  Some solutions, for example Puppet, offer a simplified proprietary DSL (Domain Specific Language), but this does not allow you to do everything, and so Ruby is kept as a fallback.  The downside is that the user has to learn at least one programming language (or you have to employ people who already have that knowledge), so the automation is not readily open to non-developers.  This approach also has the same downside as package-based automation in that it assumes each server can be configured independently and all in one go.  But it also has the same benefit, in that automatic rollback is conceptually a lot easier.</p>
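<p>To make the declarative idea concrete, here is a minimal Python sketch. It is not Puppet's DSL and not how any particular tool is implemented; the configuration keys and the <code>plan</code> and <code>apply_changes</code> helpers are hypothetical. The desired state is plain data, and the tool works out which changes are needed:</p>

```python
# Hypothetical configuration items for one server; real tools track
# registry keys, files, packages and so on.
desired_state = {
    "app.config/timeout": "30",
    "app.config/log_level": "info",
    "service/myapp": "running",
}

current_state = {
    "app.config/timeout": "60",
    "service/myapp": "running",
}


def plan(current, desired):
    """Return the changes needed to bring `current` in line with `desired`."""
    changes = []
    for key in sorted(desired):
        if key not in current:
            changes.append(("add", key, desired[key]))
        elif current[key] != desired[key]:
            changes.append(("update", key, desired[key]))
    return changes


def apply_changes(current, changes):
    """Apply planned changes to a copy of the state and return the copy."""
    new_state = dict(current)
    for _action, key, value in changes:
        new_state[key] = value
    return new_state


changes = plan(current_state, desired_state)
converged = apply_changes(current_state, changes)
print(changes)
```

<p>Running <code>plan</code> again on the converged state returns an empty list, which is the property that lets the same definition be applied to a server repeatedly. Note that the sketch, like the approach it illustrates, looks at one server in isolation.</p>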
<h3>Imperative-Based</h3>
<p>Imperative-based automation is more familiar, as the structure of the language is closer to traditional programming languages (such as Java, C++, Perl etc).  In this approach a programming language is used to describe what needs to be done to the target servers in a series of steps executed in a specific sequence.  Chef is an example of an imperative automation tool (its language is based on Ruby).  As with declarative-based automation, the code created (or recipe, as it is called in Chef) is still very much focused on making changes to a single isolated server, and the assumption is that those same changes will be applied to multiple servers of the same type.  There is limited understanding of making dependent changes across multiple servers, because that was not required at the server layer.  It only becomes important when you move up to the application layer.  And of course, the current offerings still require you to be familiar with, or learn, a programming language to use them.</p>
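<p>By way of illustration, the imperative style can be sketched in a few lines of Python. This is not a Chef recipe; the step functions, server names and version number are all hypothetical. The author spells out the steps and their order, and the same sequence is run against each server of a given type, one server at a time:</p>

```python
def stop_service(server):
    server["service"] = "stopped"


def copy_release(server):
    server["release"] = "2.0"  # hypothetical version number


def start_service(server):
    server["service"] = "running"


# The order of this list is the deployment logic.
STEPS = [stop_service, copy_release, start_service]


def run_steps(server, steps):
    """Run each step in order and return the names of the steps executed."""
    executed = []
    for step in steps:
        step(server)
        executed.append(step.__name__)
    return executed


# Hypothetical servers of the same type, each handled independently.
servers = [
    {"name": "web01", "service": "running", "release": "1.9"},
    {"name": "web02", "service": "running", "release": "1.9"},
]

results = {server["name"]: run_steps(server, STEPS) for server in servers}
print(results)
```

<p>Nothing in this sketch lets a step on one server wait for a step on another, which is the limitation described above.</p>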
<h3>Generic and Custom-Built Approaches</h3>
<p>Often, people try to apply generic approaches (such as PowerShell, DOS batch scripts, Perl etc) to the task of automation, or even write their own automation tool using a compiled language such as Java, C++ etc.  As any developer will tell you, they can go and write something that will deploy your application.  They can use their preferred language rather than having to use the language supported by the automation platform being employed.  And they are right: a development team can indeed write a fully capable deployment tool. The question is whether it really benefits the company to spend development time building and maintaining a deployment tool rather than focusing on its own applications.  Even then they will have many issues to face in enabling parallel execution, reporting and auditing, access and permissions control, and, importantly, synchronising activity across multiple servers.  The original intention of these approaches was once again focused on a specific server and not on the cross-server nature of application deployment.</p>
<h3>Process-Based</h3>
<p>Process-based automation is a different approach, created more recently to address the needs of application release automation.  ARA platforms such as DeployIt and UrbanDeploy and, of course, our own tool Nolio all take this approach. These tools seek to support existing application deployment processes, the ones that operators would normally step through manually.  The focus is on processes, and the tools allow an operator to define them in a visual way, with an understanding of the cross-server nature of application deployments. Let’s say that you need to do something on an application’s web server, then the application server, then update tables in the database, then make a second change to the application server and finally another change to the web server.  With a server-centric approach (such as those discussed above) it is very hard to orchestrate activities across multiple different servers, to synchronise the changes so they happen at the right time in relation to each other, and even to pass information between those servers.  With a process-based system this is straightforward &#8211; you define the different server types, drop the relevant steps onto each one and draw links to define the synchronisation points (for example, only run the database steps once the application server steps have completed).</p>
<p><a href="http://devops.com/2012/12/10/approaches-to-application-release-automation/process-based-automation/" rel="attachment wp-att-411"><img src="http://devopsdotcom.files.wordpress.com/2012/12/process-based-automation.png?w=865&#038;h=573" alt="process-based-automation" width="865" height="573" class="alignleft size-full wp-image-411" /></a></p>
<p><small>Diagram 1 – Screenshot of a Nolio deployment process, including activity on 4 different server types.</small></p>
<p>The issues mentioned above, such as parallel execution and cross-server synchronisation, should already have been dealt with by the ARA platform rather than having to be built on top of a more generic platform.  As the diagram above shows, with an ARA platform cross-server synchronisation is simply a case of drawing a dependency link from an action on one server type to an action on another server type.  Now the “Stop Application” action on the “Application Server Type” will not run until “Prepare for package distribution” on the “Repository node” has completed.</p>
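<p>The dependency links can be thought of as edges in a graph of steps. The Python sketch below is only an illustration of that idea, not a description of how Nolio or any other ARA platform works internally. It encodes the web server, application server and database sequence described earlier and uses <code>graphlib.TopologicalSorter</code> from the Python standard library (3.9 or later) to derive an order that respects every link. The step names are ours.</p>

```python
from graphlib import TopologicalSorter

# Each step is (server type, action). The value is the set of steps that
# must complete before that step may start: one "dependency link" each.
dependencies = {
    ("web", "first change"): set(),
    ("app", "first change"): {("web", "first change")},
    ("db", "update tables"): {("app", "first change")},
    ("app", "second change"): {("db", "update tables")},
    ("web", "second change"): {("app", "second change")},
}

# static_order() yields the steps so that every step appears after all of
# the steps it depends on, regardless of which server type it runs on.
order = list(TopologicalSorter(dependencies).static_order())

for server_type, action in order:
    print(f"{server_type}: {action}")
```

<p>Because the links here form a single chain, there is only one valid order. With a less constrained graph, steps that do not depend on each other could be run in parallel.</p>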
<p>In addition, process-based automation brings the following benefits: there is no need to modify the deployment process to work around an automation tool’s inability to synchronise activity across multiple servers, so consistency between the manual and automated approaches is maintained; the defined process doubles as documentation for how the deployment should be done, as it is inherently very readable; there is a detailed and relevant audit trail because the process is in line with how the deployment would be done manually; and it is much easier to diagnose issues because the process follows the expected steps.</p>
<p>You can read more articles on Application Release Automation on <a href="http://blog.noliosoft.com/">Nolio’s Blog Site</a>.</p>
<br />  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/devopsdotcom.wordpress.com/409/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/devopsdotcom.wordpress.com/409/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=devops.com&#038;blog=20154961&#038;post=409&#038;subd=devopsdotcom&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://devops.com/2012/12/10/approaches-to-application-release-automation/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/ca9a5341a9cf42e4a190816db94a88ca?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">scrumtrilescent</media:title>
		</media:content>

		<media:content url="http://devopsdotcom.files.wordpress.com/2012/12/gears.png" medium="image">
			<media:title type="html">gears</media:title>
		</media:content>

		<media:content url="http://devopsdotcom.files.wordpress.com/2012/12/process-based-automation.png" medium="image">
			<media:title type="html">process-based-automation</media:title>
		</media:content>
	</item>
		<item>
		<title>Big Data Problems in Monitoring at eBay</title>
		<link>http://devops.com/2012/11/11/big-data-problems-in-monitoring-at-ebay/</link>
		<comments>http://devops.com/2012/11/11/big-data-problems-in-monitoring-at-ebay/#comments</comments>
		<pubDate>Sun, 11 Nov 2012 16:17:13 +0000</pubDate>
		<dc:creator>mattokeefe</dc:creator>
				<category><![CDATA[CaseStudy]]></category>
		<category><![CDATA[Presentation]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Monitoring]]></category>

		<guid isPermaLink="false">http://devops.com/?p=335</guid>
		<description><![CDATA[This post is based on a talk by Bhaven Avalani and Yuri Finklestein at QConSF 2012 (slides). Bhaven and Yuri work on the Platform Services team at eBay. by @mattokeefe This is a Big Data talk with Monitoring as the context. The problem domain includes operational management (performance, errors, anomaly detection), triaging (Root Cause Analysis), [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=devops.com&#038;blog=20154961&#038;post=335&#038;subd=devopsdotcom&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><em>This post is based on a talk by Bhaven Avalani and Yuri Finklestein at <a href="http://qconsf.com/sf2012/">QConSF 2012</a> (<a href="http://qconsf.com/dl/qcon-sanfran-2012/slides/BhavenAvalani_and_YuriFinklestein_AppWatchABigDataApplicationMonitoringSystemForEBay.pdf">slides</a>). Bhaven and Yuri work on the Platform Services team at eBay.<br />
</em><br />
by @mattokeefe</p>
<p>This is a Big Data talk with Monitoring as the context. The problem domain includes operational management (performance, errors, anomaly detection), triaging (Root Cause Analysis), and business monitoring (customer behavior, click stream analytics). Customers of Monitoring include dev, Ops, infosec, management, research, and the business team. How much data? In 2009 it was tens of terabytes per day, now more than 500 TB/day. Drivers of this volume are business growth, SOA (many small pieces log more data), business insights, and Ops automation.</p>
<p>The second aspect is Data Quality. There are logs, metrics, and events with decreasing entropy in that order. Logs are free-form whereas events are well defined. Veracity increases in that order. Logs might be inaccurate.</p>
<p><a href="http://devopsdotcom.files.wordpress.com/2012/11/screen-shot-2012-11-11-at-9-58-11-am.png"><img src="http://devopsdotcom.files.wordpress.com/2012/11/screen-shot-2012-11-11-at-9-58-11-am.png?w=402&#038;h=214" alt="" title="Data Quality" width="402" height="214" class="aligncenter size-full wp-image-376" /></a></p>
<p>There are tens of thousands of servers in multiple datacenters generating logs, metrics and events that feed into a data distribution system. The data is distributed to OLAP, Hadoop, and HBase for storage. Some of the data is dealt with in real time, while other activities, such as OLAP for metric extraction, are not.</p>
<p><a href="http://devopsdotcom.files.wordpress.com/2012/11/screen-shot-2012-11-11-at-10-01-03-am.png"><img src="http://devopsdotcom.files.wordpress.com/2012/11/screen-shot-2012-11-11-at-10-01-03-am.png?w=608&#038;h=485" alt="" title="Functional Architecture" width="608" height="485" class="aligncenter size-full wp-image-378" /></a></p>
<p><strong>Logs</strong><br />
How do you make logs less &#8220;wild&#8221;? Typically there is no schema, no typing, and no governance. At eBay they impose a log format as a requirement. The log entry types include open and close for transactions, with times for transaction begin and end, a status code, and arbitrary key-value data. Transactions can be nested. Another type is the atomic transaction. There are also types for events and heartbeats. They generate 150 TB of logs per day.</p>
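<p>A minimal sketch of what such a structured format might look like, in Python. The type codes, field names, and tab-separated layout are assumptions made for illustration; the talk does not spell out eBay&#8217;s actual wire format.</p>
<pre>
```python
import time
from dataclasses import dataclass, field

# Hypothetical entry type codes; not eBay's real ones.
TXN_OPEN, TXN_CLOSE, ATOMIC, EVENT, HEARTBEAT = "t", "T", "A", "E", "H"


@dataclass
class LogEntry:
    kind: str               # one of the type codes above
    name: str               # transaction or event name
    timestamp: float        # seconds since the epoch
    status: str = "0"       # status code; "0" means success in this sketch
    depth: int = 0          # number of enclosing open transactions
    data: dict = field(default_factory=dict)  # arbitrary key-value payload

    def format(self) -> str:
        """Render the entry as one tab-separated line."""
        payload = ";".join(f"{k}={v}" for k, v in sorted(self.data.items()))
        return "\t".join([self.kind, f"{self.timestamp:.3f}", str(self.depth),
                          self.name, self.status, payload])


class TransactionLogger:
    """Emits open/close pairs and tracks nesting depth."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.stack = []      # names of currently open transactions
        self.lines = []      # formatted output, in emission order

    def open(self, name):
        entry = LogEntry(TXN_OPEN, name, self.clock(), depth=len(self.stack))
        self.stack.append(name)
        self.lines.append(entry.format())

    def close(self, status="0", **data):
        name = self.stack.pop()   # closes the innermost open transaction
        entry = LogEntry(TXN_CLOSE, name, self.clock(), status,
                         depth=len(self.stack), data=data)
        self.lines.append(entry.format())

    def event(self, name, **data):
        entry = LogEntry(EVENT, name, self.clock(),
                         depth=len(self.stack), data=data)
        self.lines.append(entry.format())
```
</pre>
<p>Because every line carries a type, a depth, and a status in fixed positions, a downstream parser can rebuild the transaction tree without guessing at free-form text.</p>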
<p><a href="http://devopsdotcom.files.wordpress.com/2012/11/screen-shot-2012-11-11-at-10-06-39-am.png"><img src="http://devopsdotcom.files.wordpress.com/2012/11/screen-shot-2012-11-11-at-10-06-39-am.png?w=723&#038;h=359" alt="" title="Structured Logging" width="723" height="359" class="aligncenter size-full wp-image-380" /></a></p>
<p><strong>Large Scale Data Distribution</strong><br />
The hardest part of distributing such large amounts of data is fault handling. It is necessary to be able to buffer data temporarily and to handle large spikes. Their solution is similar to <a href="https://github.com/facebook/scribe">Scribe</a> and <a href="https://cwiki.apache.org/FLUME/">Flume</a>, except that the unit of work is a log entry with multiple lines, and the lines must be processed in the correct order. The Fault Domain Manager copies the data into downstream domains. It uses a system of queues to handle the temporary unavailability of a destination domain such as Hadoop or Messaging. The queues also indicate the pressure that the tens of thousands of publisher clients put on the system. They are implemented as circular buffers so that they can start dropping data if the pressure is too great. Different policies, such as drop head and drop tail, are applied depending on the domain&#8217;s requirements.</p>
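<p>The two drop policies can be sketched with a small bounded queue in Python. This is an illustration only: the class and method names are made up, and it uses the usual networking meaning of the terms (drop head evicts the oldest entry, drop tail discards the incoming one), which the talk did not define explicitly.</p>
<pre>
```python
from collections import deque


class BoundedQueue:
    """Fixed-capacity FIFO that sheds load instead of blocking publishers."""

    def __init__(self, capacity, policy="drop_head"):
        if policy not in ("drop_head", "drop_tail"):
            raise ValueError(f"unknown policy: {policy}")
        self.capacity = capacity
        self.policy = policy
        self.items = deque()
        self.dropped = 0     # running count of discarded entries

    def put(self, entry):
        """Enqueue an entry; return False if the new entry was discarded."""
        if len(self.items) == self.capacity:
            self.dropped += 1
            if self.policy == "drop_tail":
                return False          # keep old data, discard the newcomer
            self.items.popleft()      # drop_head: evict the oldest entry
        self.items.append(entry)
        return True

    def get(self):
        """Dequeue the oldest entry, or None if the queue is empty."""
        return self.items.popleft() if self.items else None

    def pressure(self):
        """Fill ratio between 0 and 1; a crude back-pressure signal."""
        return len(self.items) / self.capacity
```
</pre>
<p>A destination that cares most about fresh data would plausibly pick drop head, while one that must not lose the start of an incident would pick drop tail. Each queued entry here stands for a whole multi-line log entry, so its lines stay together and in order.</p>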
<p><a href="http://devopsdotcom.files.wordpress.com/2012/11/screen-shot-2012-11-11-at-10-14-57-am.png"><img src="http://devopsdotcom.files.wordpress.com/2012/11/screen-shot-2012-11-11-at-10-14-57-am.png?w=426&#038;h=511" alt="" title="Fault Domains" width="426" height="511" class="aligncenter size-full wp-image-383" /></a></p>
<p><strong>Metric Extraction</strong><br />
The raw log data is a great source of metrics and events. The client does not need to know ahead of time what is of interest. The heart of the system that does this is Distributed OLAP. There are multiple dimensions such as machine name, cluster name, datacenter, transaction name, etc. The system maintains counters in memory on hierarchically described data. Traditional OLAP systems cannot keep up with this amount of data, so the work is partitioned across layers consisting of publishers, buses, aggregators, combiners, and query servers. The aggregators produce OLAP cubes, which are multidimensional structures of counters. The combiner then merges them into one gigantic cube that is made available for queries.</p>
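<p>The aggregate-then-combine idea can be sketched in a few lines of Python. This is a toy under stated assumptions: the dimension names come from the paragraph above, but the wildcard roll-up scheme and the function names are hypothetical, and the dimensions are treated as independent rather than hierarchical.</p>
<pre>
```python
from collections import Counter
from itertools import product

DIMENSIONS = ("datacenter", "cluster", "machine", "transaction")  # assumed order
ALL = "*"   # wildcard meaning "rolled up over this dimension"


def aggregate(records):
    """Build a partial cube: one counter per roll-up of each record."""
    cube = Counter()
    for record in records:
        values = [record[d] for d in DIMENSIONS]
        # Each dimension is either kept or replaced by the wildcard.
        for mask in product((True, False), repeat=len(DIMENSIONS)):
            key = tuple(v if keep else ALL for v, keep in zip(values, mask))
            cube[key] += 1
    return cube


def combine(partial_cubes):
    """Merge partial cubes from several aggregators by summing counters."""
    total = Counter()
    for cube in partial_cubes:
        total.update(cube)
    return total


def query(cube, **filters):
    """Return the count for the given dimension values; others roll up."""
    key = tuple(filters.get(d, ALL) for d in DIMENSIONS)
    return cube[key]
```
</pre>
<p>Because counters merge by simple addition, each aggregator can work on its own slice of the traffic and the combiner only has to sum. Materializing every roll-up costs 2^4 keys per record here, so a production system would need something far more economical.</p>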
<p><a href="http://devopsdotcom.files.wordpress.com/2012/11/screen-shot-2012-11-11-at-10-08-05-am.png"><img src="http://devopsdotcom.files.wordpress.com/2012/11/screen-shot-2012-11-11-at-10-08-05-am.png?w=735&#038;h=553" alt="" title="Distributed OLAP" width="735" height="553" class="aligncenter size-full wp-image-381" /></a></p>
<p><strong>Time Series Storage</strong><br />
<a href="http://oss.oetiker.ch/rrdtool/">RRD</a> was a remarkable invention when it came out, but it can&#8217;t deal with data at this scale. One solution is to use a column-oriented database such as <a href="http://hbase.apache.org/">HBase</a> or <a href="http://cassandra.apache.org/">Cassandra</a>. However, you don&#8217;t know what your row size should be, and handling very large rows is problematic. <a href="http://opentsdb.net/">OpenTSDB</a>, on the other hand, uses fixed row sizes based on time intervals. At eBay&#8217;s scale, with millions of metrics per second, you need to down-sample based on metric frequency. To solve this, they introduced the concept of multiple row spans for different resolutions.</p>
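<p>One way to picture multiple row spans is a row key that depends on the resolution, so that coarser data gets wider rows. The Python sketch below is a guess at the shape of the idea, not eBay&#8217;s schema: the resolution table, key layout, and function names are all hypothetical.</p>
<pre>
```python
from collections import defaultdict

# Hypothetical resolutions: (seconds per point, seconds per row).
# Coarser data gets wider rows so that row sizes stay bounded.
RESOLUTIONS = {
    "raw": (1, 3600),          # 1 s points, one row per hour
    "1m": (60, 86400),         # 1 min points, one row per day
    "1h": (3600, 30 * 86400),  # 1 h points, one row per 30 days
}


def row_key(metric, resolution, timestamp):
    """Return (row key, column offset) for a data point."""
    step, span = RESOLUTIONS[resolution]
    base = timestamp - timestamp % span   # start of the row's time span
    offset = (timestamp - base) // step   # column within the row
    return (metric, resolution, base), offset


def downsample(points, step):
    """Average (timestamp, value) pairs into buckets of `step` seconds."""
    buckets = defaultdict(list)
    for timestamp, value in points:
        buckets[timestamp - timestamp % step].append(value)
    return [(start, sum(vals) / len(vals))
            for start, vals in sorted(buckets.items())]
```
</pre>
<p>With these numbers no row holds more than 3600 columns, whatever the resolution, which avoids the very large rows mentioned above.</p>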
<p><strong>Insights</strong><br />
* Entropy is important to look at; remove it as early as possible<br />
* Data distribution needs to be flexible and elastic<br />
* Storage should be optimized for access patterns</p>
<p><strong>Q&amp;A</strong><br />
Q. What are the outcomes in terms of value gained?<br />
A. Insights into availability of the site are important as they release code every day. Business insights into customer behavior are great too.</p>
<p>Q. How do they scale their infrastructure and do deployments?<br />
A. Each layer is horizontally scalable but they&#8217;re still working on auto-scaling at this time. eBay is looking to leverage Cloud automation to address this.</p>
<p>Q. What is the smallest element that you cannot divide?<br />
A. Logs must be processed atomically. It is hard to parallelize metric families.</p>
<p>Q. How do you deal with security challenges?<br />
A. Their security team applies governance. Also there is a secure channel that is encrypted for when you absolutely need to log sensitive data.</p>
]]></content:encoded>
			<wfw:commentRss>http://devops.com/2012/11/11/big-data-problems-in-monitoring-at-ebay/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/170ee1d168d893a7c99041beab64ffe7?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mattokeefe</media:title>
		</media:content>

		<media:content url="http://devopsdotcom.files.wordpress.com/2012/11/screen-shot-2012-11-11-at-9-58-11-am.png" medium="image">
			<media:title type="html">Data Quality</media:title>
		</media:content>

		<media:content url="http://devopsdotcom.files.wordpress.com/2012/11/screen-shot-2012-11-11-at-10-01-03-am.png" medium="image">
			<media:title type="html">Functional Architecture</media:title>
		</media:content>

		<media:content url="http://devopsdotcom.files.wordpress.com/2012/11/screen-shot-2012-11-11-at-10-06-39-am.png" medium="image">
			<media:title type="html">Structured Logging</media:title>
		</media:content>

		<media:content url="http://devopsdotcom.files.wordpress.com/2012/11/screen-shot-2012-11-11-at-10-14-57-am.png" medium="image">
			<media:title type="html">Fault Domains</media:title>
		</media:content>

		<media:content url="http://devopsdotcom.files.wordpress.com/2012/11/screen-shot-2012-11-11-at-10-08-05-am.png" medium="image">
			<media:title type="html">Distributed OLAP</media:title>
		</media:content>
	</item>
		<item>
		<title>Release Engineering at Facebook</title>
		<link>http://devops.com/2012/11/08/release-engineering-at-facebook/</link>
		<comments>http://devops.com/2012/11/08/release-engineering-at-facebook/#comments</comments>
		<pubDate>Fri, 09 Nov 2012 02:14:45 +0000</pubDate>
		<dc:creator>mattokeefe</dc:creator>
				<category><![CDATA[CaseStudy]]></category>
		<category><![CDATA[Presentation]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://devops.com/?p=354</guid>
		<description><![CDATA[This post is based on a talk by Chuck Rossi at QConSF 2012. Chuck is the first Release Engineer to work at Facebook. by @mattokeefe Chuck tries to avoid the &#8220;D&#8221; &#8220;O&#8221; word&#8230; DevOps. But he was impressed by a John Allspaw presentation at Velocity 09 &#8220;10+ Deploys Per Day: Dev and Ops Cooperation at [&#8230;]]]></description>
				<content:encoded><![CDATA[<p><em>This post is based on a talk by <a href="https://twitter.com/chuckr">Chuck Rossi</a> at <a href="http://qconsf.com/sf2012/">QConSF 2012</a>. Chuck is the first Release Engineer to work at Facebook.</em><br />
by @mattokeefe</p>
<p>Chuck tries to avoid the &#8220;D&#8221; &#8220;O&#8221; word&#8230; DevOps. But he was impressed by a John Allspaw presentation at Velocity 09 &#8220;<a href="http://www.youtube.com/watch?v=LdOe18KhtT4">10+ Deploys Per Day: Dev and Ops Cooperation at Flickr</a>&#8220;. This led him to set up a bootcamp session at Facebook and this post is based on what he tells new developers.</p>
<p><strong>The Problem</strong><br />
Developers want to get code out as fast as possible. Release Engineers don&#8217;t want anything to break. So there&#8217;s a need for a process. &#8220;Can I get my rev out?&#8221; &#8220;No. Go away&#8221;. That doesn&#8217;t work. They&#8217;re all working to make change. Facebook operates at ludicrous speed. They&#8217;re at massive scale. No other company on earth moves as fast at their scale.</p>
<p>Chuck has two things at his disposal: tools and culture. He latched on to the culture thing after Allspaw&#8217;s talk. The first thing that he tells developers is that they will shepherd their changes out to the world. If they write code and throw it over the wall, it will affect Chuck&#8217;s Mom directly. You have to deal with dirty work and it is your operational duty from check-in to trunk to in-front-of-my-Mom. There is no QA group at Facebook to find your bugs before they&#8217;re released.</p>
<p>How do you do this? You have to know when and how a push is done. All systems at Facebook follow the same path, and they push every day.</p>
<p><strong>How does Facebook push?</strong><br />
Chuck doesn&#8217;t care what your source control system is. He hates them all. They push from trunk. On Sunday at 6 p.m. they take trunk and cut a branch called latest. Then they test for two days before shipping. This is the old school part. On Tuesday they ship, then Wednesday through Friday they cherry pick more changes. Between 50 and 300 cherry picks are shipped per day.</p>
<p>But Chuck wanted more. &#8220;<a href="https://www.facebook.com/notes/facebook-engineering/ship-early-and-ship-twice-as-often/10150985860363920">Ship early and ship twice as often</a>&#8221; was a post he wrote on the Facebook engineering blog. (check out the funny comments). They started releasing 2x/day in August. This wasn&#8217;t as crazy as some people thought, because the changes were smaller with the same number of cherry picks per day.</p>
<p>About 800 developers check in per week. It keeps growing as they hire more. There are about 10k commits per month to a 10M LOC codebase. But the rate of cherry picks per day has remained pretty stable. There is a cadence for how things go out. So you should put most of your effort into the big weekly release. Then lots of stuff crowds in on Wednesday as fixes come in. Be careful on Friday. At Google they had &#8220;no push Fridays&#8221;. Don&#8217;t check in your code and leave. Sunday and Monday are their biggest days, as everyone uploads and views all the photos from everyone else&#8217;s drunken weekend.</p>
<p>Give people an out. If you can&#8217;t remember how to do a release, don&#8217;t do anything. Just check into trunk and you can avoid the operational burden of showing up for a daily release.</p>
<p>Remember that you&#8217;re not the only team shipping on a given day. Coordinate changes for large things so you can see what&#8217;s planned company-wide. Facebook uses Facebook groups for this.</p>
<p><strong>Dogfooding</strong><br />
You should always be testing. People say it but don&#8217;t mean it, but Facebook takes it very seriously. Employees never go to the real facebook.com because they are redirected to <a href="http://www.latest.facebook.com" rel="nofollow">http://www.latest.facebook.com</a>. This is their production Facebook plus all pending changes, so the whole company is seeing what will go out. Dogfooding is important. If there&#8217;s a fatal error, you get directed to the bug report page.</p>
<p>File bugs when you can reproduce them. Make it easy and low friction for internal users to report an issue. The internal Facebook includes some extra chrome with a button that captures session state, then routes a bug report to the right people.</p>
<p>When Chuck does a push, there&#8217;s another step: a developer&#8217;s changes are not merged unless that developer has shown up. You have to reply to a message to confirm that you&#8217;re online and ready to support the push. So the actual build is <a href="http://www.inyour.facebook.com" rel="nofollow">http://www.inyour.facebook.com</a>, which has fewer changes than latest.</p>
<p>Facebook.com is not to be used as a sandbox. Developers have to resist the urge to test in prod. If you have a billion users, don&#8217;t figure things out in prod. Facebook has a separate complete and robust sandbox system.</p>
<p>On-call duties are serious. They make sure that they have engineers assigned as point of contact across the whole system. Facebook has a tool that allows quick lookup of on-call people. No engineer escapes this.</p>
<p><strong>Self Service</strong><br />
Facebook does everything in IRC. It scales well with up to 1000 people in a channel. Easy questions are answered by a bot. There is a command to look up the status of any rev. There is also a browser shortcut. Bots are your friends and they track you like a dog. A bot will ask a developer to confirm that they want a change to go out.</p>
<p><strong>Where are we?</strong><br />
Facebook has a dashboard with nice graphs showing the status of each daily push. There is also a test console. When Chuck does the final merge, he kicks off a system test immediately. They have about 3500 unit test suites and he can run one on each machine. He reruns the tests after every cherry pick.</p>
<p><strong>Error tracking</strong><br />
There are thousands and thousands of web servers. There&#8217;s good data in the error logs but they had to write a custom log aggregator to deal with the volume. At Facebook you can click on a logged error and see the call stack. Click on a function and it expands to show the git blame and tell you who to assign a bug to. Chuck can also use Scuba, their analysis system, which can show trends and correlate to other events. Hover over any error, and you get a sparkline that shows a quick view of the trend.</p>
<p><strong>Gatekeeper</strong><br />
This is one of Facebook&#8217;s main strategic advantages and is key to their environment. It is like a feature flag manager that is controlled by a console. You can turn new features on selectively and restrict the set of users who see the change. Once they turned on &#8220;fax your photo&#8221; for only Techcrunch as a joke.</p>
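<p>Gatekeeper itself is internal to Facebook, so the following Python is only a generic feature-flag sketch of the kind of check described above. The class, its parameters, and the hashing scheme are assumptions, not Gatekeeper&#8217;s API.</p>
<pre>
```python
import hashlib


class FeatureGate:
    """A feature flag limited to an allowlist plus a percentage rollout."""

    def __init__(self, name, percent=0, allowlist=()):
        self.name = name
        self.percent = percent            # 0 to 100
        self.allowlist = set(allowlist)   # user ids that always pass

    def bucket(self, user_id):
        """Map a user to a stable bucket from 0 to 99 for this feature."""
        digest = hashlib.sha256(f"{self.name}:{user_id}".encode()).hexdigest()
        return int(digest, 16) % 100

    def allows(self, user_id):
        """True if the user is allowlisted or falls inside the rollout."""
        if user_id in self.allowlist:
            return True
        return self.bucket(user_id) in range(self.percent)
```
</pre>
<p>Hashing the feature name together with the user id keeps each user&#8217;s answer stable between requests while giving different features different slices of the user base. The joke above would correspond to something like <code>FeatureGate("fax_your_photo", allowlist={"techcrunch"})</code>.</p>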
<p><strong>Push karma</strong><br />
Chuck&#8217;s job is to manage risk. When he looks at the cherry pick dashboard it shows the size of the change, and the amount of discussion in the diff tool (how controversial is the change). If both are high he looks more closely. He can also see push karma rated up to five stars for each requestor. He has an unlike button to downgrade your karma. If you get down to two stars, Chuck will just stop taking your changes. You have to come and have a talk with him to get back on track.</p>
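<p>The triage rule described above can be written down as a small Python sketch. The line and comment thresholds are invented for illustration, since the talk gave no numbers beyond the star ratings, and the function names are hypothetical.</p>
<pre>
```python
from dataclasses import dataclass

# Illustrative thresholds; only the star ratings come from the talk.
LARGE_CHANGE_LINES = 200
HEATED_DISCUSSION_COMMENTS = 10
MIN_KARMA_STARS = 3      # requests are refused at two stars or below
MAX_KARMA_STARS = 5


@dataclass
class CherryPick:
    author: str
    lines_changed: int
    review_comments: int


def triage(pick, karma):
    """Classify a cherry-pick request as 'reject', 'inspect', or 'accept'."""
    stars = karma.get(pick.author, MAX_KARMA_STARS)
    if stars not in range(MIN_KARMA_STARS, MAX_KARMA_STARS + 1):
        return "reject"       # author has to talk to the release engineer
    big = pick.lines_changed >= LARGE_CHANGE_LINES
    heated = pick.review_comments >= HEATED_DISCUSSION_COMMENTS
    return "inspect" if big and heated else "accept"


def unlike(karma, author):
    """Downgrade an author's karma by one star, never below zero."""
    karma[author] = max(0, karma.get(author, MAX_KARMA_STARS) - 1)
```
</pre>
<p>In practice the dashboard informs a human decision; the point of the sketch is only that size and controversy are combined, and karma acts as a gate in front of both.</p>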
<p><strong>Perflab</strong><br />
This is a great tool that does a full performance regression on every change. It will compare perf of trunk against the latest branch. </p>
<p><strong>HipHop for PHP</strong><br />
This generates about 600 highly optimized C++ files that are then linked into a single binary. But sometimes they use interpreted PHP in dev. This is a problem that they plan to solve with the PHP virtual machine that they plan to open source.</p>
<p><strong>Bittorrent</strong><br />
This is how they distribute the massive binary to many thousands of machines. Clients contact an Open Tracker server for a list of peers. There is rack affinity, and Chuck can push in about 15 minutes.</p>
<p><strong>Tools alone won&#8217;t save you</strong><br />
The main point is that you cannot tool your way out of this. The people coming on board have to be brainwashed so they buy into the cultural part. You need the right company with support from the top all the way down.</p>
]]></content:encoded>
			<wfw:commentRss>http://devops.com/2012/11/08/release-engineering-at-facebook/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/170ee1d168d893a7c99041beab64ffe7?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mattokeefe</media:title>
		</media:content>
	</item>
		<item>
		<title>Hacking Culture for Continuous Delivery</title>
		<link>http://devops.com/2012/11/08/hacking-culture-for-continuous-delivery/</link>
		<comments>http://devops.com/2012/11/08/hacking-culture-for-continuous-delivery/#comments</comments>
		<pubDate>Thu, 08 Nov 2012 22:19:38 +0000</pubDate>
		<dc:creator>mattokeefe</dc:creator>
				<category><![CDATA[HighLevel]]></category>
		<category><![CDATA[Presentation]]></category>

		<guid isPermaLink="false">http://devops.com/?p=338</guid>
		<description><![CDATA[This post is based on a new talk by @jesserobbins at QConSF 2012 (slides). Jesse is a firefighter, the former Master of Disaster at Amazon, and the Founding CEO of Opscode, the company behind Chef. by @mattokeefe photo credit: John Keatley Operations at web scale is the ability to consistently create and deploy reliable software [&#8230;]]]></description>
				<content:encoded><![CDATA[<p><em>This post is based on a new talk by <a href="https://twitter.com/jesserobbins">@jesserobbins</a> at <a href="http://qconsf.com/sf2012/">QConSF 2012</a> (<a href="http://qconsf.com/dl/qcon-sanfran-2012/slides/JesseRobbins_ChangingCultureBeingAForceForAwesome.pdf">slides</a>). Jesse is a firefighter, the former Master of Disaster at Amazon, and the Founding CEO of <a href="http://qconsf.com/sf2012/">Opscode</a>, the company behind Chef.</em><br />
by @mattokeefe<br />
photo credit: John Keatley</p>
<p><img alt="Jesse Robbins, Firefighter" src="http://cdn.geekwire.com/wp-content/uploads/robbins-firefighter.jpg?7794fe" height="468" width="405" align="right" /></p>
<p>Operations at web scale is the ability to consistently create and deploy reliable software to an unreliable platform that scales horizontally. Jesse created the <a href="http://velocityconf.com/">Velocity conference</a> to explore how to do this, learning from companies that do it well. Google, Amazon, Microsoft, and Yahoo built their own automation and deployment tools. When Jesse left Amazon he was stunned at the lack of mature tooling elsewhere. Many companies considered their tools to be &#8220;secret sauce&#8221; that gave them a competitive advantage. Opscode was founded to provide Cloud infrastructure automation. Jesse&#8217;s experience helping other companies down this road led to a set of culture hacks that will help you adopt Continuous Delivery.</p>
<p><a href="http://devopsdotcom.files.wordpress.com/2012/11/screen-shot-2012-11-11-at-10-27-09-am.png"><img src="http://devopsdotcom.files.wordpress.com/2012/11/screen-shot-2012-11-11-at-10-27-09-am.png?w=611&#038;h=459" alt="" title="Automate All The Things" width="611" height="459" class="aligncenter size-full wp-image-385" /></a></p>
<p><strong>Continuous Delivery</strong><br />
Continuous Delivery is the end state of thinking about and approaching a wide array of problems in a new way. Big changes to software systems that build up over long periods of time suck. A long time and lots of code changes lead to breakage that is hard to fix. The Continuous Deployment way means small amounts of code deployed frequently. Awesome in theory, but it requires organizational change. The effort is worth it, however, as the benefits include faster time to value, higher availability, happier teams, and more cool stuff. Given this, it is surprising that Continuous Delivery has taken so long to be accepted.</p>
<p>Teams that do Continuous Delivery are much happier. Seeing your code live is very gratifying. You have the freedom to experiment with new things because you aren&#8217;t stuck dealing with large releases and the challenge of getting everything right in one go.</p>
<p>Learning about Continuous Delivery is very exciting, but the reality is that back at the office things are challenging. Organizational change is hard. Let&#8217;s consider a roadmap for cultural change. The first problem is &#8220;it worked fine in test, it&#8217;s Ops&#8217; problem now.&#8221;</p>
<p><a href="http://devopsdotcom.files.wordpress.com/2012/11/screen-shot-2012-11-11-at-10-31-19-am.png"><img src="http://devopsdotcom.files.wordpress.com/2012/11/screen-shot-2012-11-11-at-10-31-19-am.png?w=610&#038;h=454" alt="" title="Ops Problem" width="610" height="454" class="aligncenter size-full wp-image-386" /></a></p>
<p>Ops likes to punish dev for this.</p>
<p><a href="http://devopsdotcom.files.wordpress.com/2012/11/screen-shot-2012-11-11-at-10-34-44-am.png"><img src="http://devopsdotcom.files.wordpress.com/2012/11/screen-shot-2012-11-11-at-10-34-44-am.png?w=706&#038;h=450" alt="" title="Supporting Developers" width="706" height="450" class="aligncenter size-full wp-image-387" /></a></p>
<p>Tools are not enough (even really great tools like Chef!). In order to succeed you have to convince people that you can be trusted and that you want to work together. The reason for this is well understood; see, for example, <a href="http://en.wikipedia.org/wiki/Conway's_law">Conway&#8217;s law</a>. Teams need to work together continuously, not just at deploy time.</p>
<p>The choice is to discourage change in the interest of stability, or to allow change to happen as often as it needs to. Asking people which one they would choose works better than just making a statement.</p>
<p><strong>Common Attributes of Web Scale Cultures</strong></p>
<ul>
<li>Infrastructure as Code. This is the most important entry point, providing full-stack automation. Commodity hardware can be used with this approach, as reliability is provided in the software stack. Datacenters must have APIs; you can&#8217;t rely on humans to take action. All services including things like DNS have to follow this model. Infrastructure becomes a product, and the app dev team is the customer.</li>
<li>Applications as Services. This means SOA with things like loose coupling and versioned APIs. You must also design for failure, and this is where a lot of teams struggle. Database/storage abstraction is important as well. Complexity is pushed up the stack. Deep instrumentation is critical for both infrastructure and apps.</li>
<li>Dev / Ops as Teams. Shared metrics and monitoring, incident management. Sometimes it is good to rotate devs through the on-call duties so everyone gets experience. Tight integration means a set of tools that integrates tightly with all of the teams. This leads to Continuous Integration, which leads to Continuous Delivery. The Site Reliability Engineer role is important in this model so you have people that understand the system from top to bottom. Finally, thorough testing is important e.g. <a href="http://www.slideshare.net/jesserobbins/ameday-creating-resiliency-through-destruction">GameDay</a>.</li>
</ul>
<p>None of this is new; consider Theory of Constraints, Lean/JIT, Six Sigma, Toyota Production System, Agile, etc. However, you need to recognize that it takes a cultural change to make it work. Every org will say &#8220;we can&#8217;t do it that way because&#8230;&#8221; They&#8217;re trying to think about where they are and extrapolate to this new state. It&#8217;s like an elephant (the enterprise) trying to fly. You have to give them a way to think about making incremental, evolutionary changes toward the goal.</p>
<p>Cultural change takes a long time. This is the hardest thing. Jesse&#8217;s Rule: Don&#8217;t Fight Stupid, Make More Awesome! Pick your battles and do these <strong>5 things</strong>:</p>
<ul>
<li>Start small and build on trust and safety. The machinery will resist you if you try sweeping change.</li>
<li>Create champions. Attack the least contentious thing first.</li>
<li>Use metrics to build confidence. Create something that you can point to to get people excited. Time to value is a good one.</li>
<li>Celebrate successes. This builds excitement, even for trivial accomplishments. The thing is to create arbitrary points where you can look back and see progress.</li>
<li>Exploit Compelling Events. When something breaks it is a chance to do something different. &#8220;Currency to Make Change&#8221; is made available, as <a href="https://twitter.com/allspaw">John Allspaw</a> puts it.</li>
</ul>
<p><strong>Start small</strong></p>
<ul>
<li>Small change isn&#8217;t a threat and it&#8217;s easy to ignore. Too big of a change will meet resistance, so start small.</li>
<li>Just call it an experiment. Don&#8217;t present the change as an all or nothing commitment.</li>
</ul>
<p><strong>Creating Champions</strong></p>
<ul>
<li>Get executive sponsors, starting with your boss</li>
<li>Give everyone else the credit. When people around you succeed, celebrate it.</li>
<li>Give &#8220;Special Status&#8221;. This is magic. Special badges, SRE bomber jackets at Google&#8230; these things are cool and you&#8217;re giving people something they want.</li>
<li>Have people with &#8220;Special Status&#8221; talk about the new awesome. Make them evangelists and create mentor programs to build an internal structure of advocates.</li>
</ul>
<p><strong>Metrics</strong></p>
<ul>
<li>Find KPIs that support change. Hacking metrics is important to drive change. Having KPIs around things like time to value is compelling. Relate shipping code to revenue.</li>
<li>Track and use KPIs ruthlessly. First you show value, then you show the cost of not making the change by laggards. This is the carrot and stick approach.</li>
<li>Tell your story with data. <a href="http://www.ted.com/speakers/hans_rosling.html">Hans Rosling</a> has a great TED talk on this topic. This is the most powerful hack. Include stories about what your competitors are doing. There&#8217;s no other way to make this work.</li>
</ul>
<p><strong>Celebrating Successes</strong></p>
<ul>
<li>Tell a powerful story</li>
<li>Always be positive about people and how they overcame a problem. This is especially important with Ops people who tend to be grumpy.</li>
<li>Never focus on the people who created the problem. Focus instead on the problem itself.</li>
<li>Leave room for people to come to your side. Otherwise you&#8217;ll make enemies. Don&#8217;t fight stupid.</li>
</ul>
<p><strong>Compelling Events</strong></p>
<ul>
<li>Just wait, one will come. Things are never stable. Exploit challenges like compliance or moving to Cloud.</li>
<li>Don&#8217;t say &#8220;I told you so&#8221;, instead ask &#8220;what do we do now?&#8221; Make it safe for people to decide to change.</li>
</ul>
<p>Remember, don&#8217;t fight stupid, make more awesome!</p>
]]></content:encoded>
			<wfw:commentRss>http://devops.com/2012/11/08/hacking-culture-for-continuous-delivery/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/170ee1d168d893a7c99041beab64ffe7?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mattokeefe</media:title>
		</media:content>

		<media:content url="http://cdn.geekwire.com/wp-content/uploads/robbins-firefighter.jpg?7794fe" medium="image">
			<media:title type="html">Jesse Robbins, Firefighter</media:title>
		</media:content>

		<media:content url="http://devopsdotcom.files.wordpress.com/2012/11/screen-shot-2012-11-11-at-10-27-09-am.png" medium="image">
			<media:title type="html">Automate All The Things</media:title>
		</media:content>

		<media:content url="http://devopsdotcom.files.wordpress.com/2012/11/screen-shot-2012-11-11-at-10-31-19-am.png" medium="image">
			<media:title type="html">Ops Problem</media:title>
		</media:content>

		<media:content url="http://devopsdotcom.files.wordpress.com/2012/11/screen-shot-2012-11-11-at-10-34-44-am.png" medium="image">
			<media:title type="html">Supporting Developers</media:title>
		</media:content>
	</item>
		<item>
		<title>QCon SF 2012, a DevOps field guide</title>
		<link>http://devops.com/2012/10/13/qcon-sf-2012-a-devops-field-guide/</link>
		<comments>http://devops.com/2012/10/13/qcon-sf-2012-a-devops-field-guide/#comments</comments>
		<pubDate>Sat, 13 Oct 2012 19:18:38 +0000</pubDate>
		<dc:creator>mattokeefe</dc:creator>
				<category><![CDATA[highlights]]></category>
		<category><![CDATA[conference]]></category>
		<category><![CDATA[continuous delivery]]></category>
		<category><![CDATA[devops]]></category>

		<guid isPermaLink="false">http://devops.com/?p=318</guid>
		<description><![CDATA[by @mattokeefe I am very excited to be attending QCon SF for the first time Nov 7-9. There is quite a bit of DevOps related content, and I will be live-blogging as much as possible. Meanwhile, here are some notes on what you might look forward to as an attendee. Monday 11/5 Tutorial: Continuous Delivery [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>by <a href="https://twitter.com/mattokeefe">@mattokeefe</a></p>
<p><a href="http://qconsf.com/"><img src="http://qconsf.com/dl/qcon-sanfran-2012/Web/QCONSF_header_2012.png" alt="QCon SF 2012 logo" /></a></p>
<p>I am very excited to be attending <a href="http://qconsf.com/">QCon SF</a> for the first time Nov 7-9. There is quite a bit of DevOps related content, and I will be live-blogging as much as possible. Meanwhile, here are some notes on what you might look forward to as an attendee.</p>
<p><strong>Monday 11/5</strong></p>
<p>Tutorial: <a href="http://qconsf.com/sf2012/presentation/Continuous+Delivery">Continuous Delivery &#8211; Jez Humble</a><br />
<a href="http://jezhumble.net/">Jez</a> wrote the book on <a href="http://continuousdelivery.com/">Continuous Delivery</a>, which ties Agile and DevOps together into a pipeline of goodness. I saw him present at <a href="http://devops.com/2011/08/29/announcement-camp-devops-conference-in-chicago/">Camp DevOps</a> last year, and it was awesome.</p>
<p><strong>Tuesday 11/6</strong></p>
<p>Tutorial: <a href="http://qconsf.com/sf2012/presentation/Implementing+a+Continuous+Delivery+Pipeline%3A++From+Commit+to+Deploy">Implementing a Continuous Delivery Pipeline: From Commit to Deploy &#8211; John Esser &amp; Dan Gilmer</a><br />
More Continuous Delivery goodness.</p>
<p><strong>Wednesday 11/7</strong></p>
<p>Opening Keynote: <a href="http://qconsf.com/sf2012/presentation/Opening+Keynote%3A+Cool+%26+Useless">Cool &amp; Useless &#8211; Kevlin Henney</a><br />
When I think of DevOps, I often think of addressing concerns around availability, reliability, and performance. There are so many variables with each solution to a given technology problem that it is wise to limit your toolkit just so you can get a handle on these concerns. So I hope that this talk also addresses the &#8220;cool &amp; harmful&#8221; aspect, in the sense that the more things you try, the more likely you are to get burned in production.</p>
<p><a href="http://qconsf.com/sf2012/presentation/The+realtime+web%3A+HTML5+WebSockets%2C+Engine.IO%2C+Socket.IO%2C+SPDY%2C+HTTP2.0+%26+Beyond">The realtime web: HTML5 WebSockets, Engine.IO, Socket.IO, SPDY, HTTP2.0 &amp; Beyond</a> &#8211; <a href="https://twitter.com/rauchg">Guillermo Rauch</a><br />
The realtime web promises a great leap forward in terms of UX. However, I wonder how many Ops teams are prepared for some of these new standards and their impact on infrastructure. For example, <a href="https://devcentral.f5.com/weblogs/macvittie/archive/2012/05/07/y-u-no-support-spdy-yet.aspx">SPDY support is not yet provided in some infrastructure layers</a>.</p>
<p><a href="http://qconsf.com/sf2012/presentation/AppWatch+-+a+big+data+application+monitoring+system+for+eBay">AppWatch &#8211; a big data application monitoring system for eBay &#8211; Bhaven Avalani and Yuri Finklestein</a><br />
Large scale application and infrastructure monitoring&#8230; enough said.</p>
<p><a href="http://qconsf.com/sf2012/presentation/How+not+to+measure+latency">How not to measure latency</a> &#8211; <a href="https://twitter.com/giltene">Gil Tene</a><br />
Measure Everything is a core tenet of DevOps. I always thought measuring latency was as simple as a pair of well-placed calls to System.currentTimeMillis (in Java), but this session&#8217;s agenda suggests otherwise. Gil will demonstrate and discuss some false assumptions and measurement techniques that lead to incorrect results.</p>
<p><a href="http://qconsf.com/sf2012/presentation/Continuous+Happiness">Continuous Happiness &#8211; Chris Kelly</a><br />
This talk should help to drive home the point that DevOps is a cultural movement, not a specific set of tools and processes.</p>
<p><a href="http://qconsf.com/sf2012/presentation/Caching+Hypermedia+APIs">Caching Hypermedia APIs &#8211; Tim Stokes</a><br />
Caching is one of the best ways to improve performance and scalability. However, it is easy to get wrong, so I am always trying to learn more about this topic.</p>
<p><strong>Thursday 11/8</strong> featuring an entire <a href="http://qconsf.com/sf2012/tracks/show_track.jsp?trackOID=673">Continuous Delivery track</a></p>
<p>Keynote: <a href="http://qconsf.com/sf2012/presentation/NoSQL%3A+Past%2C+Present%2C+Future">NoSQL: Past, Present, Future</a> &#8211; <a href="https://twitter.com/eric_brewer">Eric Brewer</a><br />
Eric Brewer formulated the CAP theorem, which is frequently referenced in discussions of design decisions related to NoSQL databases.</p>
<p><a href="http://qconsf.com/sf2012/presentation/Changing+Culture+%26+Being+a+Force+for+Awesome">Changing Culture &amp; Being a Force for Awesome</a> &#8211; <a href="https://twitter.com/jesserobbins">Jesse Robbins</a>, Master of Disaster<br />
Jesse is the cofounder of <a href="http://www.opscode.com/">Opscode</a>, the company behind <a href="https://github.com/opscode/chef#readme">Chef</a>. He will be talking about <a href="http://devops.com/2011/03/08/devops-culture-hacks/">hacking culture</a>.</p>
<p><a href="http://qconsf.com/sf2012/presentation/Open+Space%3A+Continuous+Delivery">Open Space: Continuous Delivery</a><br />
An <a href="http://en.wikipedia.org/wiki/Open-space_technology">unconference</a> session, where we decide the discussion topics.</p>
<p><a href="http://qconsf.com/sf2012/presentation/Product+Development+with+Continuous+Experimentation">Product Development with Continuous Experimentation</a> &#8211; <a href="https://twitter.com/hirefrank">Frank Harris</a> and <a href="https://twitter.com/nellwyn">Nell Thomas</a><br />
Etsy is a pioneer of continuous deployment, and this talk will be about how they take advantage of data to drive that process.</p>
<p><a href="http://qconsf.com/sf2012/presentation/Large-Scale+Continuous+Testing+in+the+Cloud">Large-Scale Continuous Testing in the Cloud</a> &#8211; <a href="https://sites.google.com/site/johnjpenix/">John Penix</a><br />
John will talk about how Google runs millions of automated tests per day using cloud infrastructure.</p>
<p><a href="http://qconsf.com/sf2012/presentation/Release+Engineering+at+Facebook">Release Engineering at Facebook</a> &#8211; <a href="https://twitter.com/chuckr">Chuck Rossi</a><br />
Chuck, Facebook&#8217;s first release engineer, will describe how they release hundreds of changes every day.</p>
<p><a href="http://qconsf.com/sf2012/presentation/Adopting+Continuous+Delivery">Adopting Continuous Delivery</a> &#8211; <a href="https://twitter.com/jezhumble">Jez Humble</a><br />
Jez will address the organizational, architectural and process factors that are important for adoption of Continuous Delivery.</p>
<p><strong>Friday 11/9</strong> featuring the &#8220;<a href="http://qconsf.com/sf2012/tracks/show_track.jsp?trackOID=674">Architectures you&#8217;ve always wondered about</a>&#8221; track</p>
<p>Keynote: <a href="http://qconsf.com/sf2012/presentation/Race+Conditions%2C+Distribution%2C+Interactions--Testing+the+Hard+Stuff+and+Staying+Sane">Race Conditions, Distribution, Interactions&#8211;Testing the Hard Stuff and Staying Sane</a> &#8211; <a href="https://twitter.com/rjmh">John Hughes</a><br />
This talk will be about new automated testing techniques for the trickiest test scenarios.</p>
<p><a href="http://qconsf.com/sf2012/presentation/Scaling+Pinterest">Scaling Pinterest</a> &#8211; Marty Weiner and Yashwanth Nelapati<br />
A Cloud Ninja and a Cloud Balrog will discuss server management on EC2, amongst other things.</p>
<p><a href="http://qconsf.com/sf2012/presentation/Cloud+Computing+at+Google">Cloud Computing at Google</a> &#8211; <a href="https://twitter.com/randyshoup">Randy Shoup</a><br />
Randy will present design principles for building and maintaining highly-available planet-scale applications in the cloud, including isolation, failure tolerance, testability, and security.</p>
<p><a href="http://qconsf.com/sf2012/presentation/Architecting+for+Continuous+Delivery+at+Ancestry.com">Architecting for Continuous Delivery at Ancestry.com</a> &#8211; John Esser and Russell Barnett<br />
This talk will describe how a service-oriented architecture can support Continuous Delivery.</p>
<p><a href="http://qconsf.com/sf2012/presentation/Uncommon+Sense+-+Scaling+Youtube">Uncommon Sense &#8211; Scaling Youtube</a> &#8211; Mike Solomon<br />
Mike, one of the original YouTube engineers, will outline his philosophy on scaling, testing, and writing code.</p>
<p><a href="http://qconsf.com/sf2012/presentation/Timelines+at+Scale">Timelines at Scale</a> &#8211; <a href="https://twitter.com/raffi">Raffi Krikorian</a><br />
Raffi, who likes to break things at Twitter, will talk about building, managing, and debugging an infrastructure that supports hundreds of millions of users around the world.</p>
<p><img src="http://farm4.staticflickr.com/3532/4083220012_0bbdfbd151.jpg" alt="San Francisco" /><br />
photo: <a href="http://www.flickr.com/photos/dahlstroms/">Håkan Dahlström</a></p>
<p>Besides the conference, I&#8217;m really looking forward to being in San Francisco. So many great restaurants, so little time!</p>
]]></content:encoded>
			<wfw:commentRss>http://devops.com/2012/10/13/qcon-sf-2012-a-devops-field-guide/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/170ee1d168d893a7c99041beab64ffe7?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">mattokeefe</media:title>
		</media:content>

		<media:content url="http://qconsf.com/dl/qcon-sanfran-2012/Web/QCONSF_header_2012.png" medium="image">
			<media:title type="html">QCon SF 2012 logo</media:title>
		</media:content>

		<media:content url="http://farm4.staticflickr.com/3532/4083220012_0bbdfbd151.jpg" medium="image">
			<media:title type="html">San Francisco</media:title>
		</media:content>
	</item>
	</channel>
</rss>
