Queue traffic shadowing in service of refactoring

Andrzej Deryło - January 25, 2021

One of our customers recently came up with updated requirements for one of the major flows in a system that we were developing. The requirements involved letting tenants switch some parts of it on or off and parameterize others. The remaining ones were:

  • limit bugs as much as possible, with a very optimistic (yet unrealistic) assumption of “no bugs at all”
  • with default settings, the second version of the flow must give results as similar as possible to the first version of the flow
  • limit system downtime to a minimum
  • keep the development pace of another team intact

The technical requirements entailed a lot of changes in various areas of the code, such as:

  • retrieving data for the process
  • initial selection and preparation of the data
  • actual processing of the data

Considering all the above, this seemingly simple task of adding a couple of “ifs” here and there grew into a pretty big refactoring task that also required analyzing the refactoring results. All that with no downtime, without slowing down other teams, and within all the other constraints mentioned above. A tough case indeed.

Traffic shadowing

Traffic shadowing is a technique which allows you to test new features (or even entire applications) using production traffic – before actually releasing them to production. It can be achieved by copying part of (or all of) the traffic from the usual production path to the tested application or feature.

Traffic shadowing example


On the face of it, the idea may sound simple, but implementing it takes a lot of careful planning. Here are the most important things to consider:

  • Getting traffic to test clusters without impacting critical path
  • Annotating traffic as shadowed traffic
  • Comparing live service traffic with the test cluster after shadowing
  • Stubbing out collaborating services for certain test profiles
  • Synthetic transactions
  • Virtualizing the test-cluster’s database
  • Materializing the test-cluster’s database

The System

The system we developed consisted of a few microservices. The purpose of the refactoring was to change the logic of one microservice triggered by queue messages. All messages were produced by other containers and aggregated in a single queue. After that, messages were distributed across priority queues. From the priority queues, messages were directed to the actual working queue according to their priority, and the final result of processing a single message was a row in the SQL database.

A graphical representation of the system we developed. Red arrows represent the flow we wanted to alter. For clarity, the diagram shows only a part of the system
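
The message path described above can be sketched with in-memory queues. This is only an illustration under assumptions: the article does not name the queue technology, the priority levels, or how the working queue is filled, so `PRIORITIES`, `distribute`, `fill_working_queue` and `batch_size` are hypothetical.

```python
from collections import deque

PRIORITIES = ("high", "normal", "low")  # hypothetical priority levels


def distribute(aggregated, priority_queues):
    """Move every message from the single aggregated queue to the queue for its priority."""
    while aggregated:
        message = aggregated.popleft()
        priority_queues[message["priority"]].append(message)


def fill_working_queue(priority_queues, working_queue, batch_size):
    """Move up to batch_size messages to the working queue, highest priority first."""
    moved = 0
    for priority in PRIORITIES:
        queue = priority_queues[priority]
        while queue and moved < batch_size:
            working_queue.append(queue.popleft())
            moved += 1


aggregated = deque([
    {"id": 1, "priority": "low"},
    {"id": 2, "priority": "high"},
    {"id": 3, "priority": "normal"},
    {"id": 4, "priority": "high"},
])
priority_queues = {p: deque() for p in PRIORITIES}
working_queue = deque()

distribute(aggregated, priority_queues)
fill_working_queue(priority_queues, working_queue, batch_size=3)
print([m["id"] for m in working_queue])  # [2, 4, 3]
```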


To redirect traffic to the alternative flow, we created a separate queue, added consumers to that queue and added a switch letting us control how many messages were redirected to the alternative flow, expressed as a percentage. It allowed us to turn the alternative flow on or off and manage the load on it.
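
A percentage switch of this kind could look roughly like the sketch below. It assumes that messages are copied to the shadow queue while the production queue still receives every message; the `ShadowingRouter` class and its attribute names are hypothetical and are not taken from the actual system.

```python
import random
from collections import deque


class ShadowingRouter:
    """Sends every message to production and copies a percentage to a shadow queue."""

    def __init__(self, shadow_percentage=0.0, rng=None):
        self.production_queue = deque()
        self.shadow_queue = deque()
        self.shadow_percentage = shadow_percentage  # 0 turns shadowing off
        self._rng = rng or random.Random()

    def route(self, message):
        # The production path never depends on the switch.
        self.production_queue.append(message)
        # random() is in [0.0, 1.0), so 0 never copies and 100 always copies.
        if self._rng.random() * 100 < self.shadow_percentage:
            self.shadow_queue.append(dict(message))  # shallow copy for the shadow flow


router = ShadowingRouter(shadow_percentage=25, rng=random.Random(42))
for i in range(1000):
    router.route({"id": i, "payload": f"item-{i}"})

print(len(router.production_queue), len(router.shadow_queue))
```

Changing `shadow_percentage` at runtime raises or lowers the load on the alternative flow without touching the production path.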

To store the results of the alternative flow we decided to add a new table with the same layout as the production one, in the same database. This allowed us to easily compare the results of production processing with the altered flow without setting up and mocking up data in a separate database – saving us a lot of time and headaches.
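
With both tables in one database, a comparison can be a single join. The sketch below uses an in-memory SQLite database purely for illustration; the table names `results` and `results_shadow`, the columns, and the sample rows are all made up, and the columns are assumed to be non-nullable so that `<>` is a safe comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE results (
        message_id INTEGER PRIMARY KEY, status TEXT NOT NULL, amount REAL NOT NULL);
    CREATE TABLE results_shadow (
        message_id INTEGER PRIMARY KEY, status TEXT NOT NULL, amount REAL NOT NULL);
    INSERT INTO results        VALUES (1, 'done', 10.0), (2, 'done', 20.0), (3, 'failed', 0.0);
    INSERT INTO results_shadow VALUES (1, 'done', 10.0), (2, 'done', 21.5);
""")

# Only messages processed by both flows can be compared, hence the inner join.
MISMATCH_QUERY = """
    SELECT p.message_id, p.status, p.amount, s.status, s.amount
    FROM results AS p
    JOIN results_shadow AS s ON s.message_id = p.message_id
    WHERE p.status <> s.status OR p.amount <> s.amount
    ORDER BY p.message_id
"""

mismatches = conn.execute(MISMATCH_QUERY).fetchall()
for row in mismatches:
    print(row)  # (2, 'done', 20.0, 'done', 21.5)
```

Message 3 does not show up because it was never shadowed, which is expected when only a percentage of traffic reaches the alternative flow.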

At the code level we added a separate project to contain the altered logic, the necessary abstractions and tests. We kept the input parameters the same between V1 and V2, as there was no point in changing them. Adding a separate project and keeping the input the same let us easily switch from the old flow to the new one – all it took was deleting the project containing the old logic and the references to it, and pointing the altered consumer to the production queue.
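
The value of keeping the input identical can be shown with a small sketch. Everything here is hypothetical (`FlowInput`, `FlowV1`, `FlowV2`, `Consumer`, and the multiplier logic stand in for the real business rules): both versions accept the same input type, so a consumer can be pointed at either one without any other change.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class FlowInput:
    tenant_id: str
    value: int


class Flow(Protocol):
    def process(self, data: FlowInput) -> int: ...


class FlowV1:
    def process(self, data: FlowInput) -> int:
        return data.value * 2


class FlowV2:
    def __init__(self, multiplier_by_tenant=None):
        self._multipliers = multiplier_by_tenant or {}

    def process(self, data: FlowInput) -> int:
        # Falls back to the V1 behavior when a tenant has no custom setting.
        return data.value * self._multipliers.get(data.tenant_id, 2)


class Consumer:
    """Reads messages and stores the result of whichever flow it was given."""

    def __init__(self, flow: Flow, sink: list):
        self._flow = flow
        self._sink = sink

    def handle(self, data: FlowInput) -> None:
        self._sink.append(self._flow.process(data))


production_rows, shadow_rows = [], []
production = Consumer(FlowV1(), production_rows)
shadow = Consumer(FlowV2({"tenant-b": 3}), shadow_rows)

for message in [FlowInput("tenant-a", 5), FlowInput("tenant-b", 5)]:
    production.handle(message)
    shadow.handle(message)

print(production_rows, shadow_rows)  # [10, 10] [10, 15]
```

In this sketch, switching over amounts to constructing the production consumer with `FlowV2` and deleting `FlowV1`.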

Graphical representation of the altered system


At that point, we had three major aspects covered. What about deployment?

The system was deployed on three different environments:

  • test - where we ran regression & integration tests of the major flows on real resources and anonymized data
  • dev - where we deployed current work for the customer to check
  • prod - where actual production traffic happened.

All of these environments were isolated from each other, and there was no communication between them. Real data was kept only on production. Dev was an anonymized backup of production data, used just as an environment where the customer could check whether the developed features worked as expected. The test environment was completely automated and operated on a one-day snapshot of anonymized production data to check whether the major flows worked the same as they used to.

In our case there was no separate container which required traffic redirection; it was just an alternative flow inside the application, so the problems with deploying a test cluster and virtualizing the database did not apply – everything was deployed without any modification.

The work

As we had addressed the major problems behind traffic shadowing, we created tasks, implemented them and checked for the most common bugs and mistakes – which took us about a week. At the end of the sprint we deployed the refactored flow to production, turned on redirection to the alternative flow, and resumed our usual work.

After a couple of days we went back to investigate the results – the data had been redirected and processed as expected. After further investigation, we found that there were discrepancies between V1 and V2 results. We tracked down and fixed the bugs, and deployed the fixes to production. After removing the results of the previous processing, we turned redirection on again.

We repeated these steps until the customer was satisfied with the outcome of the refactored flow.

All of the above actions were taken in parallel with normal development of the system and in the production environment. We encountered literally zero downtime, and were able to compare and test the whole feature without disrupting the actual production processing.

When the customer gave us the green light to replace the V1 process with the V2 process, it took us two business days to remove the V1-related code and store the results from V2 in the usual production table. During these two days we also cleaned up the code and removed all temporary code and database structures which were required only for traffic shadowing itself. We also adjusted our regression and integration tests to conform with the results of the new process – slightly different yet accepted by the customer.

Graphical representation of system after removing old code

Conclusion

Refactoring with traffic shadowing yielded great results for us. It allowed us to fully meet our customer’s requirements related to downtime. Apart from that, we were also able to constantly consult the outcome with the customer and, based on the feedback, improve the code to the customer’s complete satisfaction.

From the developer’s perspective, we were able to quickly implement requested changes, safely deploy them to the production environment and observe the outcome. With the ability to control the load on the alternative flow, we could investigate the performance of the new solution. At the same time, we were not slowing down the other teams’ development processes – even if we made a mistake causing bad results, we could quietly discuss what went wrong and, having resolved the problem, deploy appropriate fixes independently.

In our opinion, traffic shadowing is a very useful and powerful technique which can give your customers and developers working with you a new level of security. It might sometimes be challenging to implement but if implemented properly, it is worth the struggle.

About the author

Andrzej Deryło

Software Engineer

Andrzej is an experienced C# software engineer focused on Microsoft technologies. He acts as a system architect, designing and developing web applications. His expertise includes creating software running on vast amounts of data in the cloud.
Andrzej's goal is to develop solutions that will provide meaningful insights for business and consumers alike; he's always looking for new ways to use technology to make people's lives easier. He likes to play around with IoT solutions in his spare time.
