In this episode we talk with Peter Leechburch Auwers about services and support: which types of services there are, and how important it is to have them in place to ensure business continuity.
This is very exciting, since EmXcore is now expanding the number of services we are able to provide! So, no support yet? Need help designing, implementing or monitoring your network? Check out our brand new ‘Services Menu’.
Welcome, everyone, to another episode of the EmXcore podcast, a podcast where we invite customers, partners or anyone who we think has something interesting to say or has some interesting views on the internet networking industry.
Today we are talking to Peter. We are talking about services and support, and why it is so important to have something like that arranged. Welcome, Peter. How’s it going?
Thank you. Yeah, I’m doing fine. Thank you. How are you?
Good. To start off, can you please introduce yourself? Who are you?
Yes, of course. My name is Peter Leechburch Auwers. I’ve been working in the networking, security and cloud industry for 15 years now. I have a lot of experience in operational roles, management roles, but also engineering roles, specifically in the support and managed services department. So, providing services for the largest integrators and service providers in Europe.
Cool. Sounds good. You’re mainly doing support and services and those kinds of things. What is so important about those kinds of services?
Well, that’s a very interesting and important question. It starts with businesses that use equipment to support their business. There are two types of businesses. There are businesses whose core business is using networking equipment, for example ISPs and hosting providers. They make their money with it; their business case is built around the equipment and the infrastructure.
And you also have businesses that rely on the infrastructure to support their business, but it is not their main core business. For example, financial institutions or universities, they rely on an infrastructure to provide the education or to provide bank transfers, that’s a digital environment, but they do not make money with the infrastructure. Although they do rely on it. And that is where business continuity comes into play.
Business continuity is making sure that your business keeps running in case of emergencies. And when your business relies heavily on the infrastructure, like a bank or a service provider, then it’s really important to assess the risks that you have in your infrastructure and your equipment: routers, switches, firewalls and other equipment, which are potentially a huge risk to your business. Because if they fail, you can have an outage. And an outage can have a huge financial impact, for example, or a huge impact on your reputation, because your outage is shown in the news. It gets out on the internet really quickly. And it’s very easy for people to judge companies based on their outages. People will talk about it: “They had this issue again.” “The app was not available again.” “It’s a bad company, don’t do business with them.” That is reputational damage.
And to be able to have business continuity and to mitigate those risks, support is really important. The equipment and infrastructure that businesses use is very complex. Technology is evolving really fast, and with that evolution things also become more complex. Maintaining the level of knowledge that is required to support the infrastructure and the technology can be really hard to keep up with, and so can keeping your employees at the level that is required in case of an emergency. And that is when businesses move towards a partner that supports them. If they have issues, then that partner can help them. The partner has the specific knowledge that is required to troubleshoot issues and to help them minimise the impact if something goes wrong.
When we are talking about support and services, there is probably a pretty wide spectrum of things that fall into that category, right? Can you elaborate a little bit more on that? So, for people who don’t really know support, it sounds like ‘Okay, if something goes wrong, we get help’. But it’s probably more than that, right?
Yeah, that’s true. So, in general terms, support means you get help. But you can get help in several different ways. The equipment and the infrastructure that are used to support businesses, from start-ups to small enterprises to the biggest companies in Europe, need to be configured to support the business. Putting all the settings in the right place can be very tricky and difficult. So, assistance with configuration is one item that can be part of support.
Another thing is when equipment runs into bugs, software bugs, for example. That is something that happens; it is inevitable that software contains some bugs. So, in the end you might end up using a certain feature in a configuration that is hitting a bug, which causes some sort of failure, and that will need to be investigated. If it is not the equipment, it is more complex. It is not easy to see where the fault is. So, you need assistance in that case, and help to be able to troubleshoot what is happening and how you can prevent it in the future. Because that is the most important thing: how can you prevent it from happening again in the future? So that is also part of support.
Another thing is hardware. We have a lot of virtual environments nowadays. We have cloud, we have hypervisors to be able to spin up virtual machines, and that’s all virtualised. But that virtualised environment also runs on physical hardware. Everybody talks about the cloud as if it were a cloud you see in the sky, but it is a computer.
In the end, there is still a physical part behind it, of course.
Exactly. It only depends a little bit on who owns the equipment and who manages the equipment. That determines whether you take it as a service or whether it is your own equipment; that is basically the difference.
But the hardware can still fail. So, what do you do if the hardware fails? Then you also require support. You need to have a spare unit that you can use to replace that unit. But you can also imagine that the equipment that is used nowadays is very expensive. And having a spare part on your shelf in the basement for every single part you use can be a very expensive exercise.
So that is also one of the reasons why businesses move towards a support contract with a partner. To make sure that if something happens they can get a replacement within a certain timeframe. And that is based on the service level agreement (SLA).
A lot of times companies have their equipment in different places in the world. So, if you need a spare part in every single data centre that you’re in, then the costs can go up pretty quickly. And also, you see that there are companies that have spread their risks across multiple data centres within the same region. So, they can be in multiple data centres in Amsterdam, for example, and want to have a single location in Amsterdam with a partner to keep their spares in. They can then call that partner, who drives to the data centre with the part, replaces it and restores the service in order to minimise the business impact. Because in the end it’s all about the business impact: how to lower the risks and how to minimise them.
Of course you have a lot of experience in this area. Do you have any examples of what could happen when companies don’t arrange their support or don’t arrange any backup whatsoever, and then something happens or stuff breaks down?
Yeah, well, there are numerous examples of what can happen. There are examples in networking, for example, with ISPs that have a route reflector, which is a very complex mechanism that is used to route traffic on the internet. If that unit fails, it has an impact on all the routers in the network. So, we have had instances where one of our customers had issues with the route reflectors; it could be software related or hardware related, or both. And then you’ve got an entire service provider that is down. And if you imagine a service provider like one of the biggest ones in the Netherlands or Germany, that means about 2 million people cannot access the internet. Or all the mobile internet of a single provider goes down. Imagine that nobody has 4G connectivity on their cell phone.
It’s almost difficult to imagine that that happens.
Yeah, but it can still happen, you know. And then it’s important to have the mechanisms in place to restore it quickly. So, the impact depends on which company it is and where it is in a network. But the impact can be significant.
We’ve also seen issues with a hardware failure of a single firewall cluster. So those are two units that are supposed to be redundant, but when the issue is with the complete mechanism it causes a failure anyway. We have seen that all the payment terminals for PIN transactions were down, because they were connected to that service. There are a few companies that provide such a service. And if one of the biggest ones fails, then it has a huge social impact. Because it’s not only that you cannot browse the internet or that you don’t have connectivity. If you cannot pay in a store, it gets difficult. Especially if it’s a PIN transaction, because nowadays there are not many people that carry cash around. So, everybody relies on the infrastructure, so they can use Apple Pay or Google Pay or the wallets and other digital payment methods. But if they’re not available, then you’re stuck. And that also costs a lot of money for the companies involved.
And sometimes it comes down to a single mechanism, a single device in a network, whose failure can have such an impact. And that’s why it’s so important to have the right risk analysis in place, to understand which factors in your infrastructure your business relies on. Assess the impact of what would happen if they failed. What if our router or switch malfunctions? What if we lose the connectivity in our business? What will be the impact on our business?
That analysis should be made by everybody who owns a business, small or large. And then you can assess what it would cost you financially and in terms of your reputation. And then basically with the support contract you buy a sort of insurance.
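Editor’s aside: the “insurance” comparison can be made concrete with a back-of-the-envelope calculation. The Python sketch below is only an illustration under stated assumptions, not a method from the episode. The function name and every figure in it (failure rate, restore times, cost per hour, contract price) are hypothetical, and reputational damage is left out because it is hard to put a number on.

```python
def expected_annual_outage_cost(failures_per_year, hours_to_restore, cost_per_hour):
    """Expected yearly loss = expected failures x downtime per failure x cost per hour."""
    return failures_per_year * hours_to_restore * cost_per_hour


# All figures below are hypothetical assumptions for illustration only.
FAILURES_PER_YEAR = 0.5          # one critical failure every two years
COST_PER_HOUR = 10_000           # financial loss per hour of outage
HOURS_WITHOUT_SUPPORT = 24       # restore time when you source a spare yourself
HOURS_WITH_SUPPORT = 4           # restore time promised in the support contract (SLA)
SUPPORT_CONTRACT_PRICE = 30_000  # yearly price of the contract

without_support = expected_annual_outage_cost(
    FAILURES_PER_YEAR, HOURS_WITHOUT_SUPPORT, COST_PER_HOUR
)
with_support = expected_annual_outage_cost(
    FAILURES_PER_YEAR, HOURS_WITH_SUPPORT, COST_PER_HOUR
)

# The contract pays for itself when the avoided loss exceeds its price.
saving = without_support - with_support
contract_pays_off = saving > SUPPORT_CONTRACT_PRICE

print(f"Expected loss without support: {without_support:,.0f}")
print(f"Expected loss with support:    {with_support:,.0f}")
print(f"Contract pays off: {contract_pays_off}")
```

With these made-up numbers the avoided loss is larger than the contract price; with a lower cost per hour or a rarer failure the outcome can flip, which is exactly why the analysis has to be done per business.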
Yeah, true. Did it ever occur that you had an issue and it was so difficult to find what the issue was or how to solve it? Or do you always figure it out in the end?
Well, there are two different parts within support. So basically, you have the restoration of the service. And you have what’s called a root cause analysis. The main priority when there’s a huge incident is to restore the service as soon as possible. Which means that if there is a process that has stopped, you restart it. If there is a device that fails, it’s the same as your computer at home: if it fails you reboot it, you restart your computer. And usually it fixes the problem.
Have you tried turning it off and on again?
Yeah, that’s the first thing to restore the service. But that does not solve your issue. It only restores your service, it doesn’t solve anything. Because it can occur again, you don’t know what happened. So that is why root cause analysis is really important to find out what has happened, and how can we prevent this from happening again?
Yeah. So those are two different elements that are important in support. Restoring service should be done quickly, especially if there’s a huge business impact. But before rebooting, the support team should collect the right amount of logs and information to be able to investigate what has happened. Usually support providers do take a limited amount of time to investigate what has happened before they restore the service, because it can happen that with the restoring you wipe out the evidence. So, it’s vital to take some time to try to understand what’s going on, investigate what the issue is and collect all the necessary information, and then restore the service, because that is vital for the business.
And then you move on to the next step, which is root cause analysis. And then sometimes it’s difficult, because you rely on the vendor. You have to raise a support ticket with a vendor and say, “Hey, we got this issue, this is what we’ve seen. And we need help trying to figure out how to prevent this from happening again.” And then they will figure out if this is a known issue. Sometimes you have hit an issue that is already known, like a known bug. And sometimes it’s something new and they say, “Hey, this is interesting. We’ve never seen this before.” And then they’re going to investigate it. And then they’re going to adjust their software and provide an updated version in the future, which should solve that specific issue or prevent it from happening again.
Well, that’s interesting as well, because of course if you buy a new piece of hardware, most of the time you will go for support through a vendor. But what we do, of course, is mainly sell refurbished network equipment. So sometimes it is end of life, and then you cannot get any support from the vendor anymore. Are there ways of solving those kinds of issues, even if you cannot have contact with a vendor, and you don’t have vendor support anymore? Because you can of course still run into hardware failure, software bugs or anything like that.
Yeah, well, it’s funny that you say that, because the process of support is two ways, right. And so, first thing is to restore the service, which is actually the most important for the business. And then the root cause analysis, where you might need the vendor, is the second step, which is also very important. But the first step, restoring your service, get your business up and running again, that is the main priority in every single case.
And there are situations in which you can make do with knowledgeable people, without using a vendor. So, you can easily use end-of-life equipment, which is two years old, five years old, 10 years old. Usually it’s not that the equipment is bad. Usually, it’s because the vendor has introduced a new model, which they want to make money on. So, in the end the older models are declared end of life. And then the support stops at a certain point, because they want to invest the resources in the new equipment and not in the old. That’s part of their business model.
But that doesn’t mean the end-of-life equipment is bad. It was new once, it was good once, so it’s still good. If you can still use it, there is no argument for not using it. So, the main difference is that you cannot rely on a vendor anymore to provide you with software updates and upgrades. But if you have the benefit of using equipment that has been around for a long time, most of the bugs are already addressed. Most of the, let’s say, teething problems of the equipment have already been investigated and solved in the past. So, if you can get your hands on the latest software available for your machines, you can still run it for as long as you need. And then you need a partner who is able to support you if issues occur, to restore services. And that can easily be done without a vendor; you only need a partner who is capable and knowledgeable and has the right tools to help.
And of course, it’s also not always software-related bugs; it can still be hardware.
Yeah. And end-of-life hardware can also be replaced easily. If you have a power supply that has been running for five years, or 10 years, in the end it might fail due to dust collecting or something. So even if you don’t have a vendor support contract, you can still have a partner that can supply your spare parts, and also replace them for you in a data centre where you are not located.
So, if you’re a company based in London, and you have equipment in data centres in London, Amsterdam and Frankfurt, you can use a partner that is able to replace parts in all three locations and support you in that way. So that’s still possible; you don’t always need the vendor.
But there are occasions when, for example, certain financial institutions have to comply with certain regulations. They have to comply with certain standards that the government has issued. And then they are obliged to have some equipment that is under vendor support, mainly for the software. Usually the software is supported for much longer than you think. There are vendors, like Juniper for example, who have software which runs on different types of equipment. Those packages are introduced at some point in time, and they are supported for a certain amount of time. But it is very common that the software is still supported even though the equipment has been declared end of life, because that same software is running on the new equipment. So then as a business you still comply with the requirements from the government while your hardware might be end of life.
So, we’ve been talking mainly about why you need support if things go wrong. But sometimes you might need help as well building the network or making sure that it’s maintained or knowing what will happen when it blows up. Is that also something that I should think about in terms of services?
Yeah, well usually support is a service that is reactive. Customers have bought equipment, they’ve used it in the network and if something goes wrong, they have a phone number to call for help.
But there are also different types of services. Professional services, for example, are mainly used at the moment you buy the equipment. You don’t only buy the equipment, you can also get a knowledgeable person who can design your network with business continuity in mind, to make sure that things are redundant in such a way that it does not impact your business if something goes wrong with a device. You need experience and knowledge about the equipment and about design to be able to design it in such a way. So those are professional services; you can see them as design, build, configuration and implementation services.
Next to support services, which are reactive, you also have services that are more proactive. And proactive services usually fall in the category of managed services. Managed services are interesting because a partner can help you monitor your environment with systems and set the right thresholds to monitor the equipment for potential upcoming failure. Usually, you see it in the logs, you see certain elements of messages from a device before it fails. So, you can get clues and directions of what might happen on the network.
So, if you have a managed service partner, they can look at your network and make sure that proactive maintenance is done. They can clear the logs on time so they won’t fill up, for example, or restart certain services to prevent memory leaks and issues in the future. You can also think about rerouting traffic: if your internet utilisation is too high and you have multiple connections, they can spread the load a little bit. So, there are a lot of things that a managed service can do for you proactively to prevent failures, instead of reacting to a failure that has already happened.
Also, most of the partners use dedicated tooling that is specialised in monitoring these types of equipment. And these proactive services help you to make sure that you don’t have to do it yourself. If something is about to go wrong they have already seen it. So, they can already alert the support team that there is an issue, that there was an event that requires attention.
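Editor’s aside: the idea of setting thresholds and alerting before a failure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tooling any partner actually uses. The metric names, the threshold values, the `check_device` helper and the device name `core-router-1` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    device: str
    metric: str
    value: float
    limit: float


# Hypothetical warning thresholds, in percent. Real values depend on the environment.
THRESHOLDS = {
    "log_disk_used_pct": 80.0,       # logs about to fill up the disk
    "memory_used_pct": 90.0,         # possible memory leak
    "uplink_utilisation_pct": 75.0,  # uplink close to saturation
}


def check_device(device, readings, thresholds=THRESHOLDS):
    """Return an Alert for every reading that has reached its threshold."""
    alerts = []
    for metric, limit in thresholds.items():
        value = readings.get(metric)
        # Metrics that the device did not report are skipped.
        if value is not None and value >= limit:
            alerts.append(Alert(device, metric, value, limit))
    return alerts


# Hypothetical readings for one device.
sample_readings = {
    "log_disk_used_pct": 86.0,
    "memory_used_pct": 40.0,
    "uplink_utilisation_pct": 75.0,
}

alerts = check_device("core-router-1", sample_readings)
for alert in alerts:
    print(f"{alert.device}: {alert.metric} at {alert.value} (limit {alert.limit})")
```

In this sample the log disk and the uplink both trigger an alert while memory does not, so the support team would hear about the two at-risk items before anything has actually failed.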
You as a customer usually only experience the consequences of a failure. It’s not that you necessarily see the failure. You experience a loss of connectivity. You are sitting behind your laptop, but the internet doesn’t work. Where do you look first? Do you go to the server cabinet? Are you going to look at your switches or wiring? Or are you going to look into the firewall? Is there an issue with your ISP, for example? Does everybody have an issue, or just one company? Those are things that can be very stressful. If you have an important business meeting and you lose connectivity, that can be a problem.
But if you have a partner that runs a managed service, they proactively look into your network environment, so that if something goes down they can immediately tackle it. Most probably they will call you and say, “Hey, we’ve spotted an issue in your environment and we are currently working on it to solve it.” That can be even when you are not experiencing an issue (yet). They inform you that they have seen an issue and are preventing an outage.
Yeah. And if you can prevent it, it’s always better than having to solve it when it’s already an issue, of course.
Yeah, so that’s also why people bring their car to a garage for service. It’s proactive maintenance, to prevent you from being stranded alongside the road somewhere with your car full of luggage when you go on holiday. And the same goes for your infrastructure. If it is important to your business, you have to do proactive maintenance, and a managed service can do that.
Well, thank you so much, Peter, for joining the podcast.
And for everyone listening to this podcast, thank you for listening. Let us know your thoughts or if you have any suggestions for the next episode. Until the next one, hopefully!