Recursive JSON Masking: Handling Arrays Effectively

by Felix Dubois

Hey guys! Ever found yourself in a situation where you need to mask sensitive data within a JSON object, but the structure is complex, and arrays are giving you a hard time? You're not alone! Today, we're diving deep into the world of recursive masking in JSON, focusing specifically on how to handle arrays effectively. We'll break down a common problem, explore why it happens, and provide a robust solution to ensure your data masking is on point. Let's get started!

The Challenge: Masking Arrays in JSON

So, the main challenge we're tackling today revolves around masking data within JSON structures, especially when those structures contain arrays. Imagine you have a JSON object with sensitive information, like customer addresses, and you need to mask certain fields to protect privacy. Simple enough, right? But what happens when the field you want to mask is an array of strings? That's where things can get a bit tricky.

Let's illustrate this with an example. Suppose you have the following JSON object:

const inputJson = {
 car: "Mazda",
 addressLines: [
 "Mask me!",
 "Mask me 2!"
 ],
};

And you want to mask the addressLines field. You might try using a masking configuration like this:

const fieldsToMask = ["*addressLines"]; // Specify the field with a '*' at the beginning and NO dot(.) anywhere else in that field.

const jsonMaskConfig = {
 genericStrings: [
 {
 config: {
 maskWith: "*",
 maskAll: true
 },
 fields: fieldsToMask
 },
 ],
};

However, the result might not be what you expect. Instead of masking the individual strings within the addressLines array, you might find that the array remains unchanged:

{
 car: 'Mazda',
 addressLines: [
 "Mask me!",
 "Mask me 2!"
 ]
}

This is because the masking logic might not be recursively traversing the array elements to apply the masking. This is super frustrating, especially when you're dealing with nested data structures and you need a reliable way to ensure everything is masked correctly. The goal is to achieve the following:

{
 car: 'Mazda',
 addressLines: [
 "********",
 "**********"
 ]
}

Why Does This Happen?

The reason this happens often boils down to how the masking function is implemented. Many masking libraries or custom functions are designed to handle simple key-value pairs. When they encounter an array, they might not automatically iterate through each element and apply the masking. Instead, they might treat the array as a single entity, which doesn't match the desired behavior of masking each string within the array.

Another common pitfall is relying on specific field paths (e.g., addressLines.*) when you don't know the exact structure of the JSON. This approach works well for known structures but falls apart when dealing with dynamic or unpredictable JSON formats. Imagine if your JSON had nested arrays within arrays – specifying every possible path becomes a maintenance nightmare!
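To make the failure mode concrete, here is a minimal sketch of a flat, non-recursive masker. The name maskFlat and its parameters are made up for illustration and are not the API of any particular library; the point is only that a typeof check on the top-level value skips arrays entirely.

```javascript
// Hypothetical, non-recursive masker: it only looks at top-level string values.
function maskFlat(obj, fieldNames, maskWith = "*") {
  const result = {};
  for (const [key, value] of Object.entries(obj)) {
    if (fieldNames.includes(key) && typeof value === "string") {
      // Only plain strings are masked; an array fails the typeof check.
      result[key] = maskWith.repeat(value.length);
    } else {
      result[key] = value;
    }
  }
  return result;
}

const flatInput = { car: "Mazda", addressLines: ["Mask me!", "Mask me 2!"] };
const flatOutput = maskFlat(flatInput, ["car", "addressLines"]);
console.log(flatOutput);
// car is masked, but addressLines comes back untouched
```

Even though addressLines is listed as a field to mask, its strings survive because the function never looks inside the array.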

The Need for a Recursive Solution

To effectively mask arrays within JSON, a recursive approach is essential. A recursive function is one that calls itself during its execution. In the context of JSON masking, this means the function should:

  1. Check if the current value is an object or an array.
  2. If it's an object, iterate through its keys and recursively call the masking function on each value.
  3. If it's an array, iterate through its elements and recursively call the masking function on each element.
  4. If it's a simple value (like a string or number), apply the masking logic if the field matches the masking configuration.

By using recursion, you can traverse the entire JSON structure, no matter how deeply nested, and ensure that the masking rules are applied consistently.
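Before adding any masking, the traversal on its own can be sketched like this. collectLeaves is a hypothetical helper, not part of the solution in this article; it just lists every simple value together with the nearest object key, which shows how array elements end up associated with the name of the array that holds them.

```javascript
// Hypothetical helper: lists every leaf value together with the nearest object key.
function collectLeaves(value, key = null, out = []) {
  if (Array.isArray(value)) {
    // Array elements have no key of their own, so they inherit the array's key.
    for (const item of value) collectLeaves(item, key, out);
  } else if (typeof value === "object" && value !== null) {
    // Objects: recurse into each value, passing its own key along.
    for (const [k, v] of Object.entries(value)) collectLeaves(v, k, out);
  } else {
    // Simple value (string, number, boolean, null): record it.
    out.push([key, value]);
  }
  return out;
}

const leaves = collectLeaves({ car: "Mazda", addressLines: ["Mask me!", "Mask me 2!"] });
console.log(leaves);
```

Both address strings are reported under the key addressLines, which is exactly the information a masking rule needs.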

Crafting a Recursive Masking Function

Alright, let's get our hands dirty and build a recursive masking function that can handle arrays like a champ. We'll break down the process into manageable steps and provide code examples to illustrate each concept. This is where the magic happens, so pay close attention!

Step 1: Setting Up the Masking Logic

First, we need to define the core masking logic. This involves creating a function that takes a value and a masking configuration as input and returns the masked value. The masking configuration will specify how to mask the data (e.g., using asterisks, replacing with a fixed string, etc.).

Here's a basic example of a masking function:

function maskString(value, maskWith = '*', maskAll = true) {
 if (maskAll) {
 return maskWith.repeat(value.length);
 }
 // Add more sophisticated masking logic here if needed
 return value;
}

This function takes a string value, a maskWith character (defaulting to *), and a maskAll flag. If maskAll is true, every character of the string is replaced with the mask character, so the masked value keeps the original length. You can extend this function to support more complex masking scenarios, such as masking only a portion of the string or using different masking characters based on the data type.
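As one possible extension, here is a sketch of partial masking that keeps the last few characters visible. The function name maskKeepLast and the visibleCount parameter are assumptions made for this example; they are not part of the configuration format shown earlier.

```javascript
// Hypothetical variant: mask everything except the last `visibleCount` characters.
function maskKeepLast(value, visibleCount = 4, maskWith = "*") {
  if (value.length <= visibleCount) {
    // Too short to reveal anything safely, so mask the whole string.
    return maskWith.repeat(value.length);
  }
  const hiddenLength = value.length - visibleCount;
  return maskWith.repeat(hiddenLength) + value.slice(hiddenLength);
}

console.log(maskKeepLast("4111111111111111")); // last four characters stay visible
console.log(maskKeepLast("abc")); // short values are masked entirely
```

Masking short values completely is a design choice: revealing four characters of a five-character string would leak most of it.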

Step 2: Building the Recursive Function

Now, let's create the recursive function that will traverse the JSON structure and apply the masking logic. This function will take the JSON object, the masking configuration, and the current key as input.

Here's the basic structure of the recursive function:

function maskJSONRecursive(json, jsonMaskConfig, key = null) {
 if (typeof json === 'string') {
 return maskString(json, jsonMaskConfig.genericStrings[0].config.maskWith, jsonMaskConfig.genericStrings[0].config.maskAll);
 }
 if (Array.isArray(json)) {
 return json.map(item => maskJSONRecursive(item, jsonMaskConfig, key));
 }
 if (typeof json === 'object' && json !== null) {
 const maskedObject = {};
 for (const k in json) {
 maskedObject[k] = maskJSONRecursive(json[k], jsonMaskConfig, k);
 }
 return maskedObject;
 }
 return json;
}

Let's break down what's happening here:

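  • Base Case (String): If the json is a string, we apply the maskString function to mask it. At this stage every string is masked, whatever its key; Step 3 narrows this down to the configured fields.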
  • Array Handling: If the json is an array, we use the map function to iterate over each element and recursively call maskJSONRecursive on each element. This ensures that each item in the array is processed.
  • Object Handling: If the json is an object, we iterate over its keys. For each key, we recursively call maskJSONRecursive on the corresponding value and store the result in a new maskedObject. This ensures that nested objects are also processed.
  • Other Types: If the json is not a string, array, or object (e.g., a number or boolean), we simply return it as is.
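To see what this first version does in practice, here is a compact, self-contained sketch of the same traversal. maskAllStrings is a hypothetical stand-in that drops the configuration argument; it is only meant to show that, without a key check, every string in the structure gets masked.

```javascript
// Hypothetical compact version of the Step 2 traversal: no key check yet.
function maskAllStrings(json, maskWith = "*") {
  if (typeof json === "string") return maskWith.repeat(json.length);
  if (Array.isArray(json)) return json.map((item) => maskAllStrings(item, maskWith));
  if (typeof json === "object" && json !== null) {
    const masked = {};
    for (const k of Object.keys(json)) masked[k] = maskAllStrings(json[k], maskWith);
    return masked;
  }
  return json; // numbers, booleans, and null pass through unchanged
}

const everythingMasked = maskAllStrings({ car: "Mazda", year: 2020, addressLines: ["Mask me!"] });
console.log(everythingMasked);
// car is masked too, which is usually not what we want
```

The number is left alone, but car is masked along with the address line, which is why the next step ties masking to specific field names.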

Step 3: Integrating the Masking Configuration

We need to integrate the masking configuration into our recursive function. This involves checking if a field should be masked based on the fieldsToMask array in the configuration.

Here's how we can modify the maskJSONRecursive function to incorporate the masking configuration:

function maskJSONRecursive(json, jsonMaskConfig, key = null) {
 if (typeof json === 'string' && jsonMaskConfig.genericStrings[0].fields.includes('*' + key)) {
 return maskString(json, jsonMaskConfig.genericStrings[0].config.maskWith, jsonMaskConfig.genericStrings[0].config.maskAll);
 }
 if (Array.isArray(json)) {
 return json.map(item => maskJSONRecursive(item, jsonMaskConfig, key));
 }
 if (typeof json === 'object' && json !== null) {
 const maskedObject = {};
 for (const k in json) {
 maskedObject[k] = maskJSONRecursive(json[k], jsonMaskConfig, k);
 }
 return maskedObject;
 }
 return json;
}

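We've added a check to see whether the current key, prefixed with *, appears in the fields list of the configuration (our fieldsToMask array). If it does, we apply the maskString function. Note that the array branch passes the parent key down unchanged, so every string inside addressLines is checked against the key addressLines. That is what makes the array case work.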
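The function above only ever reads genericStrings[0]. If the configuration held several rules, the lookup could be pulled into a small helper along these lines. findMaskConfig is a hypothetical name, and the second rule with a phone field is invented purely for the example.

```javascript
// Hypothetical helper: returns the config of the first rule whose fields list
// contains "*<key>", or null when no rule matches.
function findMaskConfig(jsonMaskConfig, key) {
  if (key === null) return null; // top-level call: there is no key to match
  const rule = jsonMaskConfig.genericStrings.find((entry) =>
    entry.fields.includes("*" + key)
  );
  return rule ? rule.config : null;
}

// Example configuration with two rules (the "*phone" rule is an assumption).
const multiRuleConfig = {
  genericStrings: [
    { config: { maskWith: "*", maskAll: true }, fields: ["*addressLines"] },
    { config: { maskWith: "#", maskAll: true }, fields: ["*phone"] },
  ],
};

console.log(findMaskConfig(multiRuleConfig, "phone")); // the rule that masks with "#"
console.log(findMaskConfig(multiRuleConfig, "car")); // null, no rule matches
```

The string branch of the recursive function could then call this helper and mask only when it returns a config.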

Step 4: Putting It All Together

Now, let's put everything together and see our recursive masking function in action.

Here's the complete code:

function maskString(value, maskWith = '*', maskAll = true) {
 if (maskAll) {
 return maskWith.repeat(value.length);
 }
 // Add more sophisticated masking logic here if needed
 return value;
}

function maskJSONRecursive(json, jsonMaskConfig, key = null) {
 if (typeof json === 'string' && key && jsonMaskConfig.genericStrings[0].fields.includes('*' + key)) {
 return maskString(json, jsonMaskConfig.genericStrings[0].config.maskWith, jsonMaskConfig.genericStrings[0].config.maskAll);
 }
 if (Array.isArray(json)) {
 return json.map(item => maskJSONRecursive(item, jsonMaskConfig, key));
 }
 if (typeof json === 'object' && json !== null) {
 const maskedObject = {};
 for (const k in json) {
 maskedObject[k] = maskJSONRecursive(json[k], jsonMaskConfig, k);
 }
 return maskedObject;
 }
 return json;
}

const inputJson = {
 car: "Mazda",
 addressLines: [
 "Mask me!",
 "Mask me 2!"
 ],
};

const fieldsToMask = ["*addressLines"];

const jsonMaskConfig = {
 genericStrings: [
 {
 config: {
 maskWith: "*",
 maskAll: true
 },
 fields: fieldsToMask
 },
 ],
};

const maskedOutput = maskJSONRecursive(inputJson, jsonMaskConfig);
console.log(maskedOutput);

When you run this code, you'll get the expected output:

{
 car: 'Mazda',
 addressLines: [
 "********",
 "**********"
 ]
}

Woohoo! Our recursive masking function successfully masked the strings within the addressLines array. This is a huge win! 🎉

Advanced Scenarios and Considerations

Okay, so we've nailed the basics of recursive masking for arrays in JSON. But what about more complex scenarios? Let's explore some advanced considerations and techniques to make our masking even more robust.

Handling Nested Arrays and Objects

The beauty of recursion is that it naturally handles nested structures. Our maskJSONRecursive function will automatically traverse nested arrays and objects, applying the masking logic at each level. This means you don't need to write special code to handle deeply nested data – the recursive nature of the function takes care of it!

For example, consider this JSON:

const nestedJson = {
 person: {
 name: "John Doe",
 addresses: [
 {
 street: "123 Main St",
 city: "Anytown"
 },
 {
 street: "456 Oak Ave",
 city: "Someplace"
 }
 ]
 }
};

const nestedFieldsToMask = ["*street"];

const nestedJsonMaskConfig = {
 genericStrings: [
 {
 config: {
 maskWith: "*",
 maskAll: true
 },
 fields: nestedFieldsToMask
 },
 ],
};

const maskedNestedJson = maskJSONRecursive(nestedJson, nestedJsonMaskConfig);
console.log(maskedNestedJson);

The maskJSONRecursive function will correctly mask the street fields within the nested addresses array, giving you:

{
 person: {
 name: "John Doe",
 addresses: [
 {
 street: "***********",
 city: "Anytown"
 },
 {
 street: "***********",
 city: "Someplace"
 city: "Someplace"
 }
 ]
 }
}
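The same idea carries over to arrays nested inside arrays. The sketch below uses a compact, self-contained variant of the key-based traversal (maskByKey is a hypothetical name, and the tags data is made up) to show that the parent key is carried through every array level, and that the input object is not modified.

```javascript
// Hypothetical compact variant of the key-based recursive masker.
function maskByKey(json, fields, key = null) {
  if (typeof json === "string" && key !== null && fields.includes("*" + key)) {
    return "*".repeat(json.length);
  }
  if (Array.isArray(json)) {
    // The key is passed down unchanged, however many array levels there are.
    return json.map((item) => maskByKey(item, fields, key));
  }
  if (typeof json === "object" && json !== null) {
    const masked = {};
    for (const k of Object.keys(json)) masked[k] = maskByKey(json[k], fields, k);
    return masked;
  }
  return json;
}

const matrixJson = { id: 7, tags: [["red", "blue"], ["green"]] };
const maskedMatrix = maskByKey(matrixJson, ["*tags"]);
console.log(JSON.stringify(maskedMatrix));
console.log(JSON.stringify(matrixJson)); // the original is left as it was
```

Because each branch builds a new array or object, the masked copy can be logged or sent elsewhere while the original data stays intact.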

Selective Masking Based on Conditions

Sometimes, you might want to mask data selectively based on certain conditions. For example, you might want to mask a field only if it contains a specific value or matches a certain pattern. To achieve this, you can add conditional logic within the maskJSONRecursive function.

Here's an example of how to mask a field only if it contains the word "sensitive":

function maskJSONRecursive(json, jsonMaskConfig, key = null) {
 if (typeof json === 'string' &&
 key &&
 jsonMaskConfig.genericStrings[0].fields.includes('*' + key) &&
 json.includes("sensitive") // Add conditional check here
 ) {
 return maskString(json, jsonMaskConfig.genericStrings[0].config.maskWith, jsonMaskConfig.genericStrings[0].config.maskAll);
 }
 if (Array.isArray(json)) {
 return json.map(item => maskJSONRecursive(item, jsonMaskConfig, key));
 }
 if (typeof json === 'object' && json !== null) {
 const maskedObject = {};
 for (const k in json) {
 maskedObject[k] = maskJSONRecursive(json[k], jsonMaskConfig, k);
 }
 return maskedObject;
 }
 return json;
}

In this example, we've added a check `json.includes(