Monday, February 23, 2015

How Etsy makes Devops work

Etsy, which describes itself as an online “marketplace where people around the world connect to buy and sell unique goods,” is often trotted out as a poster child for Devops. The company latched onto the concepts early and today is reaping the benefits as it scales to keep pace with rapid business growth. Network World Editor in Chief John Dix caught up with Etsy VP of Technical Operations Michael Rembetsy to ask how the company put the ideas to work and what lessons it learned along the way.

Let’s start with a brief update on where the company stands today.

The company was founded and launched in 2005 and, by the time I joined in 2008 (the same year as Chad Dickerson, who is now CEO), there were about 35 employees. Now we have well over 600 employees and some 42 million members in over 200 countries around the world, including over 1 million active sellers. We don’t have sales numbers for this year yet, but in 2013 we had about $1.3 billion in Gross Merchandise Sales.

How, where and when did the company become interested in Devops?
When I joined, things were growing in a very organic way, and that resulted in a lot of silos and barriers within the company and distrust between different teams. The engineering department, for example, put a lot of effort into building a middle layer – what I called the layer of distrust – to allow developers to talk to our databases in a faster, more scalable way. But it turned out to be just the opposite. It created a lot more barriers between database engineers and developers.

Everybody really bonded well together on a personal level. People were staying late, working long hours, socializing after hours, all the things people do in a startup to try to be successful. We had a really awesome office vibe, a very edgy feel, and we had a lot of fun, even though we had some underlying engineering issues that made it hard to get things out the door. Deploys were often very painful. We had a traditional mindset of, developers write the code and ops deploys it. And that doesn’t really scale.

How often were you deploying in those early days?
Twice a week, and each deploy took well over four hours.

Twice a week was pretty frequent even back then, no?
Compared to the rest of the industry, sure. We always knew we wanted to move faster than everyone else. But in 2008 we compared ourselves to a company like Flickr, which was doing 10 deploys a day, which was unheard of. So we were certainly going a little bit faster than many companies, but the problem was we weren’t going fast with confidence. We were going fast with lots of pain and it was making the overall experience for everyone not enjoyable. You don’t want to continuously deploy pain to everyone. We knew there had to be a better way of doing it.

Where did the idea to change come from? Was it a universal realization that something had to give?
The idea that things were not working correctly came from Chad. He had seen quite a lot in his time at Yahoo, and knew we could do it better and we could do it faster. But first we needed to stabilize the foundation. We needed to have a solid network, needed to make sure that the site would be up, to build confidence with our members as well as ourselves, to make sure we were stable enough to grow. That took us a year and a half.

But we eventually started to figure out little things like, we shouldn’t have to do a full site deploy every single time we wanted to change the banner on the homepage. We don’t have any more banners on the homepage, but back in 2009 we did. The banner would rotate once a week and we would have to deploy the entire site in order to change it, and that took four hours. It was painful for everyone involved. We realized if we had a tool that would allow someone in member ops or engineering to go in and change that at the flick of a button we could make the process better for everyone.

So that gave birth to a dev tools team that started building some tooling that would let people other than operational folks deploy code to change a banner. That was probably one of the first Devops-like realizations. We were like, “Hey, we can build a better tool to do some of what we’re doing in a full deploy.” That really sparked a lot of thinking within the teams.

Then we realized we had to get rid of this app in the middle because it was slowing us down, and so we started working on that. But we also knew we could find a better way to deploy than making a tar file, SSH’ing and rsync’ing it out to a bunch of servers, and then running another command that pulls the server out of the load balancer, unpacks the code and then puts the server back in the load balancer. We used to sit there hoping everything was OK while we deployed across something like 15 servers. We knew we could do it faster and we knew we could do it better.
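As a rough illustration of the manual process described here, the following Python sketch simulates the one-server-at-a-time loop: pull a server out of the load balancer, unpack the new code, put the server back. It is not Etsy's tooling. The `Server` class, the `rolling_deploy` function and the `web01`-style host names are all hypothetical, and the "unpack" step is simulated by setting a version string rather than copying any files.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical stand-in for one web server behind the load balancer."""
    name: str
    version: str
    in_rotation: bool = True


def rolling_deploy(servers, new_version):
    """Deploy to one server at a time and return a log of the steps taken."""
    log = []
    for server in servers:
        # Step 1: pull the server out of the load balancer so it gets no traffic.
        server.in_rotation = False
        log.append(f"{server.name}: removed from load balancer")

        # Step 2: unpack the new code (simulated here by setting a version).
        server.version = new_version
        log.append(f"{server.name}: unpacked {new_version}")

        # Step 3: put the server back into the load balancer.
        server.in_rotation = True
        log.append(f"{server.name}: returned to load balancer")
    return log


# Fifteen servers, matching the rough figure mentioned in the interview.
servers = [Server(f"web{i:02d}", "v1") for i in range(1, 16)]
steps = rolling_deploy(servers, "v2")
print(len(steps), "steps for", len(servers), "servers")
```

Even in this toy form the cost is visible: three sequential steps per server, repeated across every machine, with someone watching each one.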

The idea of letting developers deploy code onto the site really came about toward the end of 2009, beginning of 2010. And as we started adding more engineers, we started to understand that if developers felt the responsibility for deploying code to the site they would also, by nature, take responsibility for whether the site was up or down, take performance into consideration, and gain an understanding of the stress and fear of a deploy.

It’s a little intimidating when you’re pushing that big red button that says – Put code onto website – because you could impact hundreds of thousands of people’s livelihoods. That’s a big responsibility. But whether the site breaks is not really the issue. The site is going to break now and then. We’re going to fix it. It’s about making sure the developers and others deploying code feel empowered and confident in what they’re doing and understand what they’re doing while they’re doing it.

So there wasn’t a Devops epiphany where you suddenly realized the answer to your problems. It emerged organically?
It was certainly organic. If development came up with better ideas of how to deploy faster, operations would be like, “OK, but let’s also add more visibility over here, more graphs.” And there was no animosity between each other. It was just making things faster and better and stronger in a lot of ways.

And as we did that the culture in the whole organization began to feel better. There was no distrust between people. You’re really talking about building trust and building friendships in a lot of ways, relationships between different groups, where it’s like, “Oh, yeah. I know this group. They can totally do this. That’s fine. I’ll back them up, no problem.” In a lot of organizations I’ve worked for in the past it was like, “These people? Absolutely not. They can’t do that. That’s absurd.”

And you have to remember this is in the early days where the site breaks often. So it was one of those things, like, OK, if it breaks, we fix it, but we want reliability and sustainability and uptime. So in a lot of ways it was a big leap of faith to try to create trust between each other and faith that other groups are not going to impact the rest of the people.

A lot of that came from the leadership of the organization as well as the teams themselves believing we could do this. Again, we weren’t an IBM. We were a small shop. We all sat very close to one another. We all knew when people were coming and leaving so it made it relatively easy to have that kind of faith in one another. I can’t recall a time where someone walked in and said, “Oh my God, that person deployed this and broke the site.” That never happened. People checked their egos at the door.

I was going to ask you about the physical proximity of folks. So the various teams were already sitting cheek by jowl?
In the early days we had people on the left coast and on the right coast, people in Minnesota and New York. But in 2009 we started to realize we needed to bring things back in-house to stabilize things, to make things a little more cohesive while we were creating those bonds of trust and faith. So if we had a new hire we would hire them in-house. It was more of a short-term strategy. Today we are more of a remote culture than we were in 2009.

But you didn’t actually integrate the development and operations teams?
In the early days it was very separate but there was no idea of separation. Depending upon what we were working on, we would inject ourselves into those teams, which led later to this idea of what we call designated operations. So when John Allspaw, SVP of Operations and Infrastructure, came on in 2010, we were talking about better ways to collaborate and communicate with other teams and John says, “We should do this thing called designated operations.”

The idea of designated ops is it’s not dedicated. For example, if we have a search team, we don’t have a dedicated operations person who only works on search. We have a designated person who will show up for their meetings, will be involved in the development of a new feature that’s launching. They will be injecting themselves into everything the engineering team will do as early as possible in order to bring the mindset of, “Hey, what happens if that fails to this third-party provider? Oh, yeah. Well, that’s going to throw an exception. Oh, OK. Are we capturing it? Are we displaying a friendly error for an end user to see? Etc.”
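The questions a designated ops person raises ("Are we capturing it? Are we displaying a friendly error?") can be made concrete with a small Python sketch. Everything named here is an assumption for illustration: `ProviderError`, `call_provider` and `submit_order` are hypothetical, and the provider call is a stub that always fails so the error path can be seen.

```python
import logging

logger = logging.getLogger("checkout")


class ProviderError(Exception):
    """Hypothetical exception raised when the third-party call fails."""


def call_provider(order_id):
    """Stand-in for a third-party API call; it always fails in this sketch."""
    raise ProviderError(f"provider timed out for order {order_id}")


def submit_order(order_id, provider=call_provider):
    """Call the provider and return a message that is safe to show a user."""
    try:
        provider(order_id)
    except ProviderError:
        # Capture the exception, with its traceback, so it can be graphed
        # and alerted on rather than silently lost.
        logger.exception("third-party provider failed for order %s", order_id)
        # Show the end user a friendly message instead of a stack trace.
        return "Sorry, something went wrong on our side. Please try again shortly."
    return "Order submitted."


print(submit_order(42))
```

Passing the provider in as a parameter is only a convenience for this sketch; it lets the success path be exercised with a stub that does not raise.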

And what we started doing with this idea of designated ops is educate a lot of developers on how operations works, how you build Ganglia graphs or Nagios alerts, and by doing that we actually started creating more allies for how we do things. A good example: the search team now handles all the on-call for the search infrastructure, and if they are unavailable it escalates to ops and then we take care of it.

So we started seeing some real benefits by using the idea of this designated ops person to do cross-team collaboration and communication on a more frequent basis, and that in turn gave us the ability to have more open conversations with people. So that way you remove a lot of the mentality of, “Oh, I’m going to need some servers. Let me throw this over the wall to ops.”

Instead, what you have is the designated ops person coming back to the rest of the ops team saying, “We’re working on this really cool project. It’s going to launch in about three months. With the capacity planning we’ve done it is going to require X, Y and Z, so I’m going to order some more servers and we’ll have to get those installed and get everything up and running. I want to make everybody aware I’m also going to probably need some network help, etc.”

So what we started finding was the development teams actually had an advocate through the designated ops person coming back to the rest of the ops team saying, “I’ve got this.” And when you have all of your ops folks integrating themselves into these other teams, you start finding some really cool stuff, like people actually aren’t mad at developers. They understand what they’re trying to do and they’re extremely supportive. It was extremely useful for collaboration and communication.

So Devops for you is more just a method of work.

Correct. There is no Devops group at Etsy.

How many people are involved at this point?

Product engineering is north of 200 people. That includes tech ops, development, product folks, and so on.

How do you measure success? Is it the frequency of deployments or some other metric?
Success is a really broad term. I consider failure success, as well. If we’re testing a new type of server and it bombs, I consider that a success because we learned something. We really changed over to more of a learning culture. There are many, many success metrics and some of those successes are actually failures. So we don’t have five key graphs we watch at all times. We have millions of graphs we watch.

Do you pay attention to how often you deploy?
We do. I could tell you we’re deploying over 60 times a day now, but we don’t say, “Next year we want to deploy 100 times a second.” We want to be able to scale the number of deploys we’re doing with how quickly the rest of the teams are moving. So if a designated ops or development team starts feeling some pain, we’ll look at how we can improve the process. We want to make sure we’re getting the features out we want to get out and if that means we have to deploy faster, then we’re going to solve that problem. So it’s not around the number of deploys.

I presume you had to standardize on your tool sets as you scaled.
We basically chose a LAMP stack: Linux, Apache, MySQL and PHP. A lot of people were like, “Oh, I want to use CoffeeScript or I want to use Tokyo Cabinet or I want to use this or that,” and it’s not about restricting access to languages, it’s about creating a common denominator so everyone can share experiences and collaborate.

And we wrote Deployinator, which is our in-house tool that we use to deploy code, and we open-sourced it because one of our principles is we want to share with the community. Rackspace at one point took Deployinator and rewrote a bunch of stuff and they were using it as their own deploying tool. I don’t know if they still are today, but that was back in the early days when it first launched.

We use Chef for configuration management, which is spread throughout our infrastructure; we use it all over the place. And we have a bunch of homegrown tools that help us with a variety of things. We use a lot of Nagios and Graphite and Ganglia for monitoring. Those are open-source tools that we contribute back to. I’d say that’s the vast majority of the tooling that ops uses at this point. Development obviously uses standard languages and we built a lot of tooling around that.

As other people are considering adopting these methods of work, what kind of questions should they ask themselves to see if it’s really for them?
I would suggest they ask themselves why they are doing it. How do they think they’re going to benefit? If they’re doing it to, say, attract talent, that’s a pretty terrible reason. If they’re doing it to improve the overall structure of the engineering culture, to enable people to feel more motivated and take more ownership, or because they think they can improve the community or the product they’re responsible for, that’s a really good reason to do it.

But they have to keep in mind it’s not going to be an overnight process. It’s going to take lots of time. On paper it looks really, really easy. We’ll just drop some Devops in there. No problem. Everybody will talk and it will be great.

Well, no. I didn’t marry my wife the first day I met her. It took me a long time to get to the point where I felt comfortable in a relationship to go beyond just dating. It takes longer than people think and they need to be aware of that because, if it doesn’t work after a quarter or it doesn’t work after two quarters, people can’t just abandon it. It takes a lot of time. It takes effort from people at the top and it takes effort from people on the bottom as well. It’s not just the CEO saying, “Next year we’re going to be Devops.” That doesn’t work. It has to be a cultural change in the way people are interacting. That doesn’t mean everybody has to get along every step of the way. People certainly will have discussions and disagreements about how they should do this or that, and that’s OK.

Best Microsoft MCTS Certification, Microsoft MCITP Training at certkingdom.com

Sunday, February 15, 2015

700-603 UCS Invicta for Field Engineers


QUESTION 1
Which type of workload does Iometer represent when it is configured to generate 4 KB blocks,
50% read, 50% write, and 100% random operations?

A. Bulk data load of a database table
B. OLTP
C. OLAP multidimensional cube
D. Extract, transform, and load
E. Data warehouse

Answer: B

Explanation:


QUESTION 2
Which technology has had the weakest performance growth since the year 2000?

A. Hard drives
B. Memory
C. Bus
D. Broadband wireless
E. Network
F. Processors

Answer: A

Explanation:


QUESTION 3
Which two systems does Hadoop consist of? (Choose two.)

A. Atlas
B. MapReduce
C. ZFS
D. HDFS
E. A snowflake schema systems engine
F. A star schema systems engine

Answer: B,D

Explanation:


QUESTION 4
Which settings must be configured on the management interface bond?

A. IP address, subnet mask, MTU, mode, and boot order
B. IP address, subnet mask, MTU, and mode
C. IP address and subnet mask
D. MAC address, IP address, subnet mask, and MTU
E. MAC address, subnet mask, MTU, and mode

Answer: A

Explanation:


QUESTION 5
Which statement about the creation of a new LUN when enabling LUN mirroring is true?

A. The new LUN must be created on a performance node.
B. The new LUN must not reside on the same scaling solution node as the original LUN to be
mirrored.
C. The new LUN must reside on the same scaling solution node as the original LUN to be mirrored.
D. The new LUN must be created on a data reduction node.

Answer: B

Explanation:


Thursday, February 12, 2015

600-503 Designing with Cisco Network Programmability


QUESTION 1
Which two advantages does an overlay network that consists of virtual switches on hypervisors
have compared with physical networks? (Choose two.)

A. Ability to change the logical network topology more easily.
B. Ability to experience higher performance of traffic forwarding.
C. Overlay networks integrate with virtual machines more closely.
D. They can support more routing protocols.
E. They are more secure.

Answer: A,C

Explanation:


QUESTION 2
Which two options are challenges in migrating a traditional network to an SDN type of network?
(Choose two.)

A. would incur cost to replace traditional network devices with new devices
B. would need more operators to run a more complicated network
C. would need operators with more programming skill
D. would need more bandwidth to secure redundant paths
E. would need to remove existing network management tools

Answer: A,C

Explanation:


QUESTION 3
Which statement is an example of a requirement that is not well-formed?

A. The application should provide status messages every 60 seconds.
B. The application user interface must be easy to use.
C. The application must validate that the IP addresses that are input by users are valid IPv4 or
IPv6 addresses.
D. The application must be available for end users between 8am and 8pm EST/EDT, Monday
through Friday.
E. The application should restart within 15 seconds.

Answer: B

Explanation:


QUESTION 4
Which option is a requirement represented in an Agile software development methodology?

A. interviews
B. product functions
C. product requirements document
D. user stories
E. home stenographer

Answer: D

Explanation:


QUESTION 5
Which four components should be considered when gathering business requirements for a
customer project? (Choose four.)

A. alignment to corporate goals
B. compliance regulations
C. development team location
D. commitments to customers
E. supplier capabilities
F. business unit providing the developers

Answer: A,B,D,E

Explanation:



Best Cisco CCNP Training and Cisco 600-503 Certification and more Cisco exams log in to Certkingdom.com

Tuesday, February 10, 2015

300-135 Troubleshooting and Maintaining Cisco IP Networks (TSHOOT)


QUESTION 1
Exhibit:



A network administrator is troubleshooting an EIGRP connection between RouterA, IP address
10.1.2.1, and RouterB, IP address 10.1.2.2. Given the debug output on RouterA, which two
statements are true? (Choose two.)

A. RouterA received a hello packet with mismatched autonomous system numbers.
B. RouterA received a hello packet with mismatched hello timers.
C. RouterA received a hello packet with mismatched authentication parameters.
D. RouterA received a hello packet with mismatched metric-calculation mechanisms.
E. RouterA will form an adjacency with RouterB.
F. RouterA will not form an adjacency with RouterB.

Answer: D,F

Explanation:


QUESTION 2
When troubleshooting an EIGRP connectivity problem, you notice that two connected EIGRP
routers are not becoming EIGRP neighbors. A ping between the two routers was successful. What
is the next thing that should be checked?

A. Verify that the EIGRP hello and hold timers match exactly.
B. Verify that EIGRP broadcast packets are not being dropped between the two routers with the
show ip eigrp peer command.
C. Verify that EIGRP broadcast packets are not being dropped between the two routers with the
show ip eigrp traffic command.
D. Verify that EIGRP is enabled for the appropriate networks on the local and neighboring router.

Answer: D

Explanation:


QUESTION 3
Refer to the exhibit.



How would you confirm on R1 that load balancing is actually occurring on the default-network
(0.0.0.0)?

A. Use ping and the show ip route command to confirm the timers for each default network resets
to 0.
B. Load balancing does not occur over default networks; the second route will only be used for
failover.
C. Use an extended ping along with repeated show ip route commands to confirm the gateway of
last resort address toggles back and forth.
D. Use the traceroute command to an address that is not explicitly in the routing table.

Answer: D

Explanation:


QUESTION 4
Which IPsec mode will encrypt a GRE tunnel to provide multiprotocol support and reduced
overhead?

A. 3DES
B. multipoint GRE
C. tunnel
D. transport

Answer: D

Explanation:
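GRE supplies the multiprotocol support, and IPsec transport mode avoids the additional outer IP header that tunnel mode would add around the GRE packet. The short Python sketch below works through that arithmetic under simplifying assumptions: IPv4 headers without options (20 bytes), a basic GRE header without key or sequence fields (4 bytes), and ESP's own fields left out because they are the same in both modes. The function name `extra_headers` is hypothetical.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options (assumption)
GRE_HEADER = 4    # bytes, basic GRE header without optional fields (assumption)


def extra_headers(mode):
    """Bytes of IP and GRE headers added in front of the original packet."""
    gre_encap = IPV4_HEADER + GRE_HEADER  # GRE delivery header plus GRE header
    if mode == "transport":
        # ESP protects the GRE packet behind the existing GRE delivery header.
        return gre_encap
    if mode == "tunnel":
        # Tunnel mode wraps the whole GRE packet in one more IP header.
        return gre_encap + IPV4_HEADER
    raise ValueError(f"unknown mode: {mode}")


saving = extra_headers("tunnel") - extra_headers("transport")
print("transport mode saves", saving, "bytes per packet")
```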


QUESTION 5
Which three features are benefits of using GRE tunnels in conjunction with IPsec for building
site-to-site VPNs? (Choose three.)

A. allows dynamic routing over the tunnel
B. supports multi-protocol (non-IP) traffic over the tunnel
C. reduces IPsec headers overhead since tunnel mode is used
D. simplifies the ACL used in the crypto map
E. uses Virtual Tunnel Interface (VTI) to simplify the IPsec VPN configuration

Answer: A,B,D

Explanation:



Sunday, February 1, 2015

Microsoft tells Windows 10 users to uninstall Office

Office conflicts with one of Patch Tuesday's security updates, manager cautions on Twitter

Microsoft today took the unusual step of telling users running Windows 10's Technical Preview to uninstall Office before applying one of December's security updates.

"We just made a tough call after working through the night that I thought I should share with you," wrote Gabe Aul, the engineering general manager for Microsoft's operating system group, in a four-part Twitter understatement Tuesday.

"We have a security update going out today, and the installer fails on 9879 if Office is installed," Aul continued. "Rather than rolling a new fix (losing several days in the process) we're going to publish it as is. The workaround is painful: uninstall Office, install the hotfix, reinstall Office. Sorry. We're working hard to fix."

Aul's mention of "9879" referred to the latest "build" of the preview; Microsoft issued Build 9879 four weeks ago.

Somewhat later, Aul identified the update as KB3022827, the Knowledge Base identifier displayed in Windows Update on the preview. (Computerworld was unable to find an associated page on Microsoft's support site that matched KB3022827.) He also partly retracted his advice to uninstall Office: "Please try to install KB3022827 before the workaround to uninstall Office first. It will work for many, no harm if not," he tweeted.

Several people chimed in on Aul's Twitter feed to say that they had tried the update before uninstalling Office and had no problems.

According to Microsoft, only one of today's seven security updates was to be applied to Windows 10's preview. That update, pegged as MS14-080, patched 14 vulnerabilities in Internet Explorer (IE) 11, the browser bundled with the OS.

Andrew Storms, vice president of security services at New Context, weighed in on Aul's odd workaround.

"There are always upsides and downsides to being on the bleeding edge," Storms said in an interview conducted via instant messaging. "Users who chose to grab the Windows 10 Technical Preview are now stuck between the proverbial rock and a hard place. Today, Microsoft admitted that some number of their users are plagued with Explorer crashes and what's worse, an update that won't be easy to install. I, like Microsoft, hope that these users are adept enough to figure out the workaround/fix on their own."

As Storms said, Microsoft acknowledged that one in eight users of the preview had been unable to install an earlier fix that was supposed to stop crashes of the operating system's Explorer file manager.

"On a shipping OS, if we hit an issue like this we'd normally pull the update," Aul admitted in talking about the Explorer screw-up. "But since the Windows Insider audience is technical, we decided to leave it up while we work on the fix so that people hitting the Explorer crash can get some relief."

Storms echoed Aul's confidence in Windows 10 users' skills. "Preview users are generally the most willing to nuke and repave their systems," Storms said.

