Friday, December 9, 2011

Performance based IT Shop Part 3: Architect Level

IT problems usually need to be handled by the network engineer or systems engineer. These people are the craftsmen of the IT trade. However, the solutions, no matter how robust, should be run through the Architect. Every IT department should have at least one person who sees and implements the big picture. That big picture means knowing the overall business goals and the limits of technology, but also the external governance and regulatory issues.

The CIO sets the vision for the big picture. The Architect executes. In our business, while all the roles are valuable and important, the role of the Architect is critical to ensure that business goals and objectives are being met in a highly effective, efficient, and cost-sensitive manner. These are the folks creating the next-generation IT infrastructure. The CIO and the purchasing person are just rubber-stamping it… ah, I mean approving it as a Service Desk request. In my previous blogs, I talked about IT shops having a hodge-podge set of tools. The goal of the Architect is to take what they already have and make it work, or blueprint a plan to make it work better.

The Architect can filter the tech speak into business speak. They are able to translate Key Performance Indicators (KPIs) into business services and identify what is important to the IT goals and objectives. They are in alignment with the CIO's vision. They tend to ask questions like:

Where is performance affected the most, and why?

Is computing capacity sufficient to handle current demand and future growth?

Is there effective support for the end user?

What is impacting SLAs and why?

Does the solution offer synergies and solutions to each business segment or LOB (Line of Business)?

They look to our diagnostic and analytical products like OpManager, Applications Manager, NetFlow Analyzer, EventLog Analyzer, Firewall Analyzer and IT360. Sure, they want to know the day-to-day of what's up and down and the response times, but they view these metrics in relation to the business goals. They are interested in trend reporting and the inter-dependencies among IT infrastructure components. They have this sixth sense, this spatial reasoning ability to provide foresight into the IT organization. The Architect may not have a say in staff hiring, but they are certainly aware of the utilization of staff: where and why it's performing... or, more importantly, why it's not. They are just giving the CIO ammo to fight the good fight for staffing.

One of these Architects turned consultant is Sean Freeman, CEA. He admits he drank the ManageEngine Kool-Aid a few years ago... and liked it. He has 20 years of experience as an implementer of IT infrastructure at a government contractor and mid- to large-size enterprises. At our past ManageEngine Users Conference, I caught up with him and miked him up.

http://www.manageengine.com/products/eventlog/testimonials.html

One of his first stories: he would schedule a meeting with himself every Tuesday morning. He would pore over the ManageEngine reports from the last week and look for trends, comparing trending against real-time stats. How long has an issue been degrading? How long to close tickets? Human error is still the largest contributor to IT problems, so he stressed the importance of having a strong change management mechanism. ManageEngine Device Expert helps automate device configuration changes while also keeping human intervention in check. Then, with EventLog Analyzer, he was able to find the moment in time and isolate the issue.

Weeks later, I interviewed him again. Beyond the proactive alerting and troubleshooting perspective, he said security was a constant concern. Who is hitting us? How much of a target are we? What's the Risk Exposure?

Part of the architecting process was setting up Service Desk-type services and workflows. He defines how to deal with change requests: who is supposed to be aware of the situation, and who is to approve, test and deploy.

Finally, he said it is a matter of taking control of your environment, being accountable and being able to report back to the business units. Excelling at the operational level empowers the strategic level of IT. Full circle.

Friday, October 7, 2011

Data Mining of LTE Performance Management

You don’t need a crystal ball to know that demand for telecommunications and data-bandwidth requirements is exploding. LTE standards address this huge demand for higher bandwidth, lower latency, and advanced communication services. In turn, it’s more important than ever that Element Management Systems (EMSs) and Network Management Systems (NMSs) properly control network devices to ensure calls go through, video gets viewed, online games perform, and more.

From a service provider's viewpoint, there's greater competitive pressure on revenue per subscriber. 3G and 4G networks also force you to greatly reduce operational costs (OPEX), meaning the networking devices themselves must be highly intelligent and require less operational work to keep them up and running. An LTE network handles more data and more services, and it also means more devices. Operators simply can't throw more people at these problems. So this presents challenges from both an EMS and NMS perspective.

In LTE networks, many devices need to be managed, increasing the potential points of failure or degradation. These devices include a pool of mobility management entity (MME) devices, a serving gateway (SGW) and an eNodeB cluster, in addition to core and backhaul networks.

Once devices are deployed in the network, they broadcast themselves to the Element Management System. These devices tend to be chatty and send lots of data about their physical condition, health, and performance. The EMS implications of these changes are broad. Certainly Fault Management and Configuration Management are affected, but for this article I’ll focus on the impact on Performance Management.

For performance, mobile providers turn to the 3GPP as their industry standard for the KPIs (Key Performance Indicators) that determine the health of their devices and pinpoint issues that need to be resolved quickly. The data collection mechanics can vary: it could be polled data via SNMP, SOAP/XML or SQL, or custom data sources like CSV files. The EMS aggregates the data and visualizes it in a meaningful way. Common KPI examples are call-session management or call-success/failure rates. This data is then crunched for further QoS and Service Assurance purposes and sent northbound to OSS/BSS systems.
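To make the aggregation step a little more concrete, here is a minimal sketch assuming the counters arrive as a CSV export (one of the custom data sources mentioned above). The file name, column names and threshold are hypothetical rather than a 3GPP-defined schema; a real EMS would pull the raw counters from its SNMP or SOAP/XML collectors and apply the standard KPI formulas.

```python
import csv

# Minimal sketch of KPI aggregation from a hypothetical CSV export
# (kpi_counters.csv) with per-interval columns: timestamp, cell_id,
# call_attempts, call_successes. Names and threshold are illustrative.

SUCCESS_RATE_THRESHOLD = 0.98  # example KPI target, not a standard value

def load_counters(path):
    """Read raw counters exported by the collection layer."""
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            yield row

def call_success_rate(rows):
    """Aggregate a call-success-rate KPI per cell from raw counters."""
    totals = {}
    for row in rows:
        cell = row["cell_id"]
        attempts, successes = totals.get(cell, (0, 0))
        totals[cell] = (attempts + int(row["call_attempts"]),
                        successes + int(row["call_successes"]))
    return {cell: (s / a if a else 1.0) for cell, (a, s) in totals.items()}

if __name__ == "__main__":
    kpis = call_success_rate(load_counters("kpi_counters.csv"))
    for cell, rate in sorted(kpis.items()):
        flag = "DEGRADED" if rate < SUCCESS_RATE_THRESHOLD else "OK"
        print(f"{cell}: call success rate {rate:.2%} [{flag}]")
```

The same rolled-up figures are what get visualized in the EMS dashboard and forwarded northbound to the OSS/BSS layer.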

In one case, ViaSat has built a next-gen LTE satellite system. ViaSat provides a Ground-Based Beam-Forming (GBBF) system, composed of the CMS (Control and Management System), UBS (Uplink Beacon Station) and Gateway, for Boeing's mobile satellite communications, which beam the signals to multiple ground gateways. There are not many ground gateways, but each one generates a massive amount of data to process the analog-to-digital conversion, signal processing and LTE performance KPIs while constantly performing positioning with the satellite beam.

In short, LTE presents unprecedented challenges. This enormous increase in bandwidth traffic means more infrastructure devices and media servers. The amount of health and performance data is immense. This distributed data is mined and analyzed by the EMS to drive the operators' business goals … which means making sure the call goes through, the data arrives and the customer is satisfied.

Thursday, August 18, 2011

Performance based IT Shop Part 2

Not all IT problems come under the domain of the network engineer. In my previous blog, I talked about IT shops having a hodge-podge set of tools. There are various reasons for this, but the real inefficiency is when these tools perform the same functions. There comes a time when IT problems need to be looked at from different perspectives. A few examples below:


Kenn Nied, Senior Network Engineer at WA State Board for Community and Technical Colleges, illustrates this. While looking at OpManager from a networking point of view, the operator sees alerts that a few switches and a firewall are unresponsive. Is it faulty equipment or an attack? Then, turning to a security mindset, he looks at ManageEngine Device Expert to see real-time and historical configuration changes. In one case, he identified a firewall rule change and realized it was a misconfiguration that made the switch unresponsive. Diagnostic time was minimal.

Albert E. Whale, CHS CISA CISSP, Senior Technology & Security Director for ABS Computer Technology, Inc., explains the security aspect further. When you are managing the security of a business, there are several essential tools needed to manage the environment. There is a need to get a better handle on the design, information flow and stability of the environment. First comes a baseline review of all of the network devices. ManageEngine Device Expert captures the current configuration of the network switches and firewalls. It's an invaluable tool for managing change control on configurations, and also for evaluating all of the configurations at a glance. Continuing from the baseline report, both ManageEngine EventLog Analyzer and Firewall Analyzer determine bottlenecks in network throughput and attack information within the enterprise. Being proactive on security allows for protection before break-ins occur.

Bill Duffy, CTO of Northwind Technology, describes the compliance angle. IT departments face compliance oversight, whether from internal audit and risk management or from external regulatory bodies overseeing a particular industry, and they share common goals in meeting these requirements:

* Ability to incorporate aims of compliance reporting into overall monitoring and system administration strategy to optimize technology investments as requirements change and grow.
* Need to reduce the time spent on compliance and audit reporting.
* Use monitoring toolset to proactively manage risk across the organization.
* Demonstrate adherence to compliance controls with clear, objective and easily accessible evidence.

Central to achieving these aims is finding a comprehensive suite of tools that covers all areas of IT security and infrastructure and provides easy access to administrators and auditors. Moreover, it is paramount to provide a rich reporting framework to address ad-hoc and historical data requests as part of evidence gathering during audits. IT departments meeting compliance need to show service availability, IT administration staff activity tracking, change management, asset management, access control, as well as audit trails and logging (security, system, applications, maintenance, etc.).

The ManageEngine suite of products is unique in being able to effectively bridge the IT landscape to meet these compliance demands. By utilizing ManageEngine ServiceDesk Plus, OpManager and AD Manager Plus as well as modules for AD Audit Plus and Asset Explorer in an integrated fashion, we are able to provide a complete compliance approach streamlined to limit audit and administration burdens on human and system resources while delivering a risk management solution and satisfying audit controls.





Wednesday, August 3, 2011

Performance based IT Shop

Some companies make insightful IT business decisions because they have the right data, processes and software. Because ManageEngine fits into the software bucket, I'll address this straight up. It never fails to amaze me how many IT departments have a hodge-podge set of tools. Recently, I ran into a company that was using WhatsUp, Altiris, some basic MRTG and Tivoli… and it was only giving them up/down status. They also purchased Applications Manager and were liking the results.

I've heard this silo, multi-tool story many times. It happens for a variety of reasons. It is usually based on their IT infrastructure maturity level, the evolution of their needs at the time, or IT decision makers coming and going. The attitude of the moment is: "Got a problem, I'll solve it!" Reminds me of that line in the Vanilla Ice song...

Management software is not cheap and can cause neck and back problems (swiveling the head back and forth to look at multiple consoles). There came a point where they wanted to become a performance-based IT shop. They learned more about Applications Manager, added VMware and storage monitoring, then added Service Desk to aid in their trouble ticketing and incident and change management processes.

There is a time when intuition-based troubleshooting does not scale. Data needs to be collected to get a sense of what's going on in the IT infrastructure. Then one must identify strengths and weaknesses and measure progress against goals and historical data. All of which supports good decision making.

Selecting Metrics to Predict Performance

IT metrics should be defined to fit the individual need. Not all infrastructures are the same. Within ManageEngine products, one can collect hundreds of arcane metrics. In some cases, IT shops are firefighting all day and no one is aware of the performance metrics. Managing and controlling the IT metrics has big implications. Downtime and loss of productivity definitely put a hit on the financial bottom line. Selecting just a few critical metrics is key to moving toward a performance-driven organization. Revisit the metrics continually to align with the decision-making strategy. Then, make the metrics visible for all to see. Some of our customers put up a dashboard in high-traffic areas. People became more aware of and active in understanding the goals of the IT strategy, thus making everyone more accountable.

Below are customer examples to drive the point home.

Jamie Gilbert, Director and CIO of CD Baby, the largest online distributor of independent music, is using ManageEngine Applications Manager and Service Desk. He said there is an expectation within the organization of no downtime. Uptime metrics and SLA reporting, with long-term trending of site performance using URL sequence testing, are invaluable. Beyond being performance driven, he also uses it for troubleshooting analysis.

In a previous position, he implemented Applications Manager to monitor 450 real and virtual servers in a mixed Windows and Linux environment with MS-SQL and Oracle databases. He experienced issues with a new application running on Apache, Tomcat and Java. Using real-time performance reporting in Applications Manager along with long-term trending and comparative analysis reporting across servers, the team was able to home in on the root cause of the issue. The root cause ended up being an application programming issue in conjunction with Tomcat connection limitations and JVM memory allocation. It was a multifaceted problem, and Applications Manager made it possible to see the problems very easily and allowed the team to come up with a path to resolution.

Darren Qualls, CTO of Premier Global Technologies and user of ManageEngine IT360, explains database performance this way. Slow-performing databases can be extremely tricky to chase down. Take, for example, 9 terabytes of SQL Server data: throw a $20K piece of hardware at it, and the likelihood is you're still going to have problems. There are a few common issues you will run into with database servers. In most cases, you will want to start with lock waits. This is one of the standard metrics for any product. There are so many ways you can mess up record locking and not even know it for a year or two.

In 90% of cases, record lock issues are only a drop in the bucket. The next thing I run into is the disks. Slow disk access will take a half-million-dollar blade system to its knees every time! There are so many things that I have to categorically rate as self-imposed: incorrect normalization of data, bugs in code, incorrect commit placement or parameters, etc. Even underpowered hardware with incorrect initial specs, organic growth or aged-out systems will cause problems. Another is telecom issues, which can be anything that revolves externally around the system: network setup or remote pulls on queries for reports.
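As a rough sketch of where to start looking, the snippet below queries SQL Server's cumulative wait statistics and compares lock waits against disk I/O waits. The connection string and driver name are placeholders I've assumed, not anything from Darren's setup; products like IT360 are meant to collect this kind of counter for you without hand-written SQL.

```python
import pyodbc

# Minimal sketch: compare lock waits vs. disk I/O waits on SQL Server.
# Connection details are placeholders; sys.dm_os_wait_stats is cumulative
# since the last SQL Server restart.
CONN_STR = (
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=your-sql-server;DATABASE=master;UID=monitor;PWD=secret"
)

QUERY = """
SELECT wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE 'LCK_M_%'        -- record/lock waits
   OR wait_type LIKE 'PAGEIOLATCH_%'  -- waits on disk reads into the buffer pool
"""

def summarize_waits():
    with pyodbc.connect(CONN_STR) as conn:
        rows = conn.execute(QUERY).fetchall()
    lock_ms = sum(r.wait_time_ms for r in rows if r.wait_type.startswith("LCK_M_"))
    disk_ms = sum(r.wait_time_ms for r in rows if r.wait_type.startswith("PAGEIOLATCH_"))
    print(f"Lock wait time:     {lock_ms / 1000:.1f} s")
    print(f"Disk I/O wait time: {disk_ms / 1000:.1f} s")
    if disk_ms > lock_ms:
        print("Disk latency, not locking, looks like the bigger bottleneck.")

if __name__ == "__main__":
    summarize_waits()
```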

These are your common three-server setups; you'll need network maps and traffic monitoring to isolate the data and determine the issue. Do not skimp: without them, you may end up spending about three times the effort to resolve the issue.

Monday, July 25, 2011

Who is taking the lead to the Clouds? Telcos, ISPs, MSPs or Enterprises?

If a bookseller like Amazon can succeed as a cloud provider, why are telecoms so slow to get into the game?
 
I suspect the barriers are more cultural than technical. Telecoms seem cautious about jumping into new markets. The Cloud is attracting some big industry players, but that's not the real reason. Competition is not the fear. It is the memory of the dotcom crash and how telecoms didn't succeed in the Internet services space. ISPs and MSPs filled that void.
Cloud is a pure network infrastructure play, and telecoms should feel comfortable there; it’s where they are making their living. What’s more, telecom service providers know a great deal about managing software in big data centers – so the stars are aligned. 
However, this management software is leveling the playing field.  These ISPs and MSPs are building out the data centers just like their telecom brothers.  And larger Enterprises are comfortable running their own data centers too.  The Cloud looks like a data center, acts like a data center, even smells like one.
Application Performance Monitoring is the key to controlling the Cloud 
Great progress has been made in the development of robust tools and APIs to support the cloud, so there’s no reason why a Telco, ISP or MSP can’t enter the cloud business in a few short months – and do it affordably.
One key enabler is application performance monitoring (APM). You simply must have real-time visibility into the health and capacity of web servers, databases, application servers, and virtual and cloud environments so you can turn up more infrastructure before the end-user experience degrades. If an enterprise has an IT system worth protecting, it is (or should be) using APM.
In the case of offloading cloud services, one can connect directly to Amazon Web Services (AWS) using its APIs. These APIs allow you to collect full performance metrics, and APM can tell you what's happening in the Amazon cloud infrastructure.
The beauty of the latest APM solutions is they can discover applications in the Amazon cloud, analyze the metrics, and automatically trigger actions such as restarting or stopping an Amazon server when certain capacities or thresholds are met.
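Here is a minimal sketch of that kind of metric-driven action, written against Amazon's boto3 SDK rather than any particular APM product. It reads the average CPUUtilization for one instance from CloudWatch and reboots the instance if a placeholder threshold is crossed; the region, instance ID and threshold are illustrative.

```python
from datetime import datetime, timedelta
import boto3

# Minimal sketch, not an APM implementation: read CPUUtilization from
# CloudWatch and reboot the instance when it crosses a placeholder threshold.
REGION = "us-east-1"                    # placeholder
INSTANCE_ID = "i-0123456789abcdef0"     # placeholder
CPU_THRESHOLD = 80.0                    # percent, illustrative

def average_cpu(cloudwatch, instance_id, minutes=15):
    """Average CPU utilization over the last `minutes` minutes."""
    now = datetime.utcnow()
    stats = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        StartTime=now - timedelta(minutes=minutes),
        EndTime=now,
        Period=300,
        Statistics=["Average"],
    )
    points = stats["Datapoints"]
    return sum(p["Average"] for p in points) / len(points) if points else 0.0

def main():
    cloudwatch = boto3.client("cloudwatch", region_name=REGION)
    ec2 = boto3.client("ec2", region_name=REGION)
    cpu = average_cpu(cloudwatch, INSTANCE_ID)
    print(f"Average CPU over the last 15 min: {cpu:.1f}%")
    if cpu > CPU_THRESHOLD:
        # The "action": restart the hot instance. A real APM might instead
        # launch additional capacity or open an incident ticket.
        ec2.reboot_instances(InstanceIds=[INSTANCE_ID])
        print("Threshold breached; reboot requested.")

if __name__ == "__main__":
    main()
```

A production APM would of course add hysteresis, escalation and alerting rather than acting on a single sample, but the loop is the same: collect, compare against a threshold, trigger an action.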
Automation of cloud provisioning is a great leap forward, regardless of whether it's outsourced or you are rolling out your own private cloud. The whole purpose of cloud and virtual computing is to take advantage of economies of scale. In a similar way, APM's ability to automate resource provisioning delivers additional "human" economies of scale. Imagine what a burden it would be for a system administrator to log in and manually turn resources up or down. And if your cloud environment scales to dozens or hundreds of cloud instances, productivity would suffer.
Building Your Own Private Cloud and Employing Virtual Environments
Leveraging Amazon or another cloud provider’s infrastructure is certainly the fastest way to get into the cloud market. But longer term, most will build their own infrastructure and use software such as VMware, Hyper-V or Citrix XenServer.  
Similar to Amazon cloud control, APM software supports the automated provisioning of virtual machines, with only slight variations. Basically, the user creates a script for a particular "action" in the APM. If the number of active sessions on a Tomcat server shoots up, the APM can initiate the action on VMware to restart a virtual machine. Other capacity thresholds could trigger actions too: CPU utilization going over 80%, memory running low, or disk space coming close to full capacity.
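For the VMware side, an action script might look something like the sketch below, which uses the pyVmomi SDK to reset a named virtual machine through vCenter. The host, credentials and VM name are placeholders, and the threshold check itself is assumed to live in the APM that invokes the script.

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Minimal sketch of an APM "action" script: reset a named VM via the
# vCenter/ESXi API. Host, credentials and VM name are placeholders.
VCENTER = "vcenter.example.com"        # placeholder
USER, PASSWORD = "monitor", "secret"   # placeholders
VM_NAME = "tomcat-node-01"             # placeholder

def restart_vm(name):
    # Lab-only TLS settings; verify certificates in production.
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE

    si = SmartConnect(host=VCENTER, user=USER, pwd=PASSWORD, sslContext=context)
    try:
        content = si.RetrieveContent()
        # Enumerate all VMs visible to this account and find the one we want.
        view = content.viewManager.CreateContainerView(
            content.rootFolder, [vim.VirtualMachine], True)
        for vm in view.view:
            if vm.name == name:
                vm.ResetVM_Task()  # hard reset; RebootGuest() is the graceful option
                print(f"Reset requested for {name}")
                return
        print(f"VM {name} not found")
    finally:
        Disconnect(si)

if __name__ == "__main__":
    restart_vm(VM_NAME)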
Another nice thing about the latest APM software is that it's completely user configurable. Based on historical trending, you can set capacities to manage and control 50+ out-of-the-box apps and servers in your cloud environment – all from one system console.

Conclusion
We know the systems and networking people grumble about the Cloud. We know C-level people want more productivity and think the Cloud can be an answer without knowing what it takes. We know end-user apps and bandwidth demands will increase. We also know the Cloud is a freight train picking up speed, and those who get in the game early will learn some valuable lessons, realize the economies of scale and save money.
Looking back, APM tools have made great progress over the years. Remember the "self-healing networks" that IBM promoted in the early to mid 2000s? Application performance monitoring was a key technology behind that. The difference today is that it costs in the tens of thousands of dollars instead of millions.

Whether you leverage Amazon’s cloud, a partner’s cloud, or one you build yourself, an APM solution coupled with automated resource provisioning could save you months of development time and get you into the cloud faster so you can gain experience and beat your rivals to market.  And we don't mind who you are.  ManageEngine software can handle the Telco, ISP, MSP or the Enterprise.

Saturday, July 23, 2011

Net Neutrality and the role of Network Management


Update:  1/8/2015

About this time last year, the US Court of Appeals struck down the FCC's Net Neutrality order. Now the FCC is looking at monkeying with the rules again. The FCC says it wants to reclassify ISPs under Title II of the Telecommunications Act, which would give it the power to regulate them as a public utility and, potentially, the ability to tax them.

Update:  7/20/2012 http://www.humanevents.com/2012/07/15/ron-and-rand-paul-launch-a-crusade-for-internet-freedom/


Congressman Ron Paul (R-Texas) and his son, Sen. Rand Paul (R-Ky.), have always been prominent champions of libertarian philosophy. They have chosen Internet freedom as a new focus for their efforts, publishing a manifesto called “The Technology Revolution.” This crusade will rival Rep. Paul’s long quest to “end the Fed” as a top priority for his Campaign for Liberty organization.

As even the most casual acquaintance with Ron Paul or his organization would confirm, putting anything up there next to the Federal Reserve on his top policy shelf is a pretty big deal. Paul still wants to end the Fed, but he wants to make sure a free Internet survives it.
“The Technology Revolution” is a broadside against Net Neutrality, which the Campaign for Liberty manifesto describes as “government acting as arbiter and enforcer of what it deems to be ‘neutral.’” The architects of Net Neutrality are said to be “masters at hijacking the language of freedom and liberty to disingenuously push for more centralized control.” Terms like openness, Internet freedom, and competition are twisted into euphemisms for government control and the dissipation of property rights. It’s reminiscent of the way “social justice” has become a cover for endless injustice, perpetrated by the government against disfavored groups and individuals.


Update: 1/12/2012 Officials see limited government role in Internet governance

Unrelated to Net Neutrality, but Assistant Secretary of Commerce for Communications and Information Larry Strickling told an audience at the Brookings Institution that the government will stay out of the way.

"Each challenge to the multi-stakeholder model has implications for Internet governance throughout the world," he said. "When parties ask us to overturn the outcomes of these processes, no matter how well-intentioned the request, they are providing ammunition to other countries who would like to see governments take control of the Internet."

http://www.nextgov.com/nextgov/ng_20120111_1140.php?oref=rss?zone=NGtoday

Note: Originally written in the summer of 2010. Since then, the Federal Communications Commission has registered its Net Neutrality rules with the Office of Management and Budget, which is the next step in making the new regulations official.

In some corners, current discussions surrounding the issue of Net Neutrality have alarming overtones, with people concerned about the possibility of the government stepping in to regulate the Internet. This can be seen as a free-market-versus-regulation issue that could hinder the ability of service providers to tunnel content to users and tier services on a pay-to-play basis. In short, many people think that regulation will cause the Internet more harm than good. Part of the problem is that the terms haven't really been fully defined, leading to general confusion on many points. Or should they be defined at all?

Content providers want the access... but speed and choice already exist for most people, and at reasonable cost. Not for all, though. For example, at one of our branch offices in rural NJ, the connection speed could be faster (1.5 Mbps DSL) and the choice of providers is limited. Those living in bigger population centers, however, will have many more options.
   
The Electronic Frontier Foundation (EFF) supports neutrality in practice, but is opposed to open-ended grants of regulatory authority to the Federal Communications Commission (FCC). They were in the forefront of those applauding the D.C. Circuit Court of Appeals ruling this past April (2010) limiting the FCC's authority to punish Comcast for interfering with its subscribers' use of BitTorrent. While not condoning Comcast's behavior in any way, the EFF thinks that the FCC should most certainly not have regulatory powers broad enough to restrict or regulate the Internet at the Commissioners' whim.

Then the industry starts to jockey for position. Consider this sequence of events: Vonage VoIP was blocked a while back. The Apple iPhone blocks Google Voice, then AT&T tells Apple to remove the block; AT&T allows Skype; Apple announces that the Google CEO will resign his position on Apple's board. Companies voice opinions on their positions, lobbying efforts in DC intensify, and back-room meetings take place.

Verizon, as a service provider, wants to tier services to make more money, pointing to its build-out investment; Verizon CEO Ivan Seidenberg attacks the idea that carriers should be considered "dumb pipes." This "understates the role of sound network management practices," he said.

Federal Communications Commission chairman Julius Genachowski wants to "...ensure that Internet access providers are transparent about the network management practices they implement."

From my vantage point as a network management software vendor, network management for equipment providers and communications service providers is a closed community. It can be their competitive edge, and managed services is big business because this is how companies literally run their businesses. Beyond all the network devices and applications, it is the network management software that is running the business. Network and application management software processes events and faults in an attempt to proactively maintain network uptime and quality of service for customers. This software is the tool they use to look at bandwidth utilization, meaning who is consuming the Internet traffic, and what is flowing through those pipes, meaning what type of traffic is being transmitted and which applications are taking up the resources. Performance optimization is the name of the game. Service providers use this data in a variety of other ways besides troubleshooting: for resource and capacity planning, for billing and accounting, and even operationally for network configuration and provisioning.
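As a toy illustration of the "who is consuming and what is flowing" questions, the sketch below tallies bytes by source address and by destination port from a hypothetical CSV export of flow records. The field names are made up; real deployments feed NetFlow or sFlow records into a collector such as NetFlow Analyzer rather than parsing files by hand.

```python
import csv
from collections import Counter

# Toy sketch: summarize a hypothetical CSV export of flow records
# (flows.csv with columns src_ip, dst_ip, dst_port, protocol, bytes)
# into top talkers and top destination ports. Field names are illustrative.

def summarize(path, top_n=5):
    by_src, by_port = Counter(), Counter()
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            size = int(row["bytes"])
            by_src[row["src_ip"]] += size
            by_port[(row["protocol"], row["dst_port"])] += size
    print("Top talkers (who is consuming the bandwidth):")
    for ip, total in by_src.most_common(top_n):
        print(f"  {ip}: {total / 1e6:.1f} MB")
    print("Top destination ports (what type of traffic):")
    for (proto, port), total in by_port.most_common(top_n):
        print(f"  {proto}/{port}: {total / 1e6:.1f} MB")

if __name__ == "__main__":
    summarize("flows.csv")
```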

Whether a company manages and controls its own network or outsources it, it connects to the service provider, and network management ensures business continuity. The service provider is best qualified to run its network in terms of high availability and performance, and it is able to do this through network management, traffic management and quality-of-service design. I really doubt service providers will want to comply with the FCC's transparent management reporting requirements.

Google is of the opinion that service providers will be tempted to block or restrict the on-ramps. Creativity, innovation and a free and open marketplace are all at stake in this fight.

Telecommunication equipment providers are not in favor of net neutrality regulations, which would apply to both wireless and wireline telecom companies, claiming in essence that a free marketplace is crucial to maintaining an open and innovative Internet.

"If the FCC takes a prescriptive approach to new regulations, then it could place itself in the position of being the final arbiter of what products and services will be allowed on the Internet." in a letter, signed by Cisco Systems, Alcatel-Lucent, Corning, Ericsson, Motorola and Nokia. And that, many people agree, just isn’t a good idea.

FCC Chairman Julius Genachowski wants to adopt rules requiring phone and cable companies to give equal treatment to all broadband traffic. But all content sites are not equal; some, like the streaming audio/video app YouTube or the online game World of Warcraft, are pipe hogs. At the consumer or small business level, it is not much of an issue today: FiOS is $50 per month and Optimum Online Ultra is $50 per month, which is plenty of bandwidth for most applications. So we are talking about possible implications for medium and large businesses. Google being a large business, I suppose it could buy telecom gear and build out, or acquire a service provider and control its destiny. But that sort of thing has been tried before. There’s a word for it – monopoly.

Content providers want to ensure equal access. What if Google, Amazon, or Facebook were able to be the top-tier content providers and get priority traffic routed to them? Imagine if a Web 2.0, cloud-based content provider were not part of the top-tier tunneled providers. Smaller content players can't afford to have the big players carve up the Internet space to their advantage.

Because Internet traffic can be routed, analyzed and prioritized, there could be privacy issues. Service providers and content companies are already pushing relevant advertising to users in the name of more revenue.

If you step back, you realize that dial-up and Frame Relay were only 15 years ago. The Internet is young. Regulation will only aid and abet the censors and politicians and will create two sides of the equation. One side will win, the other will lose. The Internet must be free of any regulation.