CODA ML-MDM: Master Data Management using NLP and Machine-Learning


All ERP and Supply-Chain Management solutions rely critically on coding schemes for materials, suppliers etc. to ensure that transactions are recorded and maintained accurately. When the coding schemes contain discrepancies, such as duplicate material codes, the information systems and physical inventories start to diverge, resulting in redundant inventory, overstocking etc. Periodic auditing has been the usual corrective measure for reconciling the physical inventory with the information systems.

For large enterprises, however, periodic auditing and reconciliation consumes so much time and effort that it is often not feasible. Master Data Management (MDM) solutions emerged and evolved to address this issue, and Machine Learning and Natural Language Processing technologies have played a constructive role in improving Master Data Management.

MDM and Data-Quality Management

Master Data Management (MDM) is the technology that has evolved to address the inefficiencies in managing large volumes of key identifiers like material codes, customer codes etc. The efficiency of an ERP system and of its modules, such as the Material Management module, depends critically on the effectiveness of the MDM.

If the MDM allows multiple material codes for a single physical material, the redundancy in the information system leads directly to material inefficiencies such as excess or unused inventory. MDM tools should ensure not only material code consistency but also other data-quality measures such as material code completeness. For example, MDM tools should not allow material codes that lack manufacturer information or have insufficient product attributes.

CODA ML-MDM Platform Technology Architecture

Recent advancements in Artificial Intelligence, Machine Learning and Natural Language Processing have provided the opportunity to innovate and improve the MDM platform and achieve a substantial step forward in data-quality management. The CODA ML-based MDM solution is a technology platform which leverages these advancements to bring new features that improve the data-quality management workflow, a process that was previously performed by human material domain experts.

CODA ML leverages advanced word-vector based similarity comparison techniques to identify whether two material descriptions are describing the same item and thereby easily identifies duplicate material descriptions. CODA ML-MDM provides many additional tools for the Material Experts to improve their productivity in the material code verification process.
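
The idea of comparing material descriptions as vectors can be illustrated with a minimal sketch. It is not CODA's implementation: plain token-count vectors stand in for learned word vectors, and the function names and sample descriptions are hypothetical.

```python
import math
import re
from collections import Counter


def vectorize(description):
    """Turn a description into a token-count vector (a stand-in for word vectors)."""
    tokens = re.findall(r"[a-z0-9]+", description.lower())
    return Counter(tokens)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical descriptions: the first two name the same item in a different word order.
d1 = vectorize("BEARING, BALL, 6204, 20MM")
d2 = vectorize("ball bearing 20mm 6204")
d3 = vectorize("VALVE, GATE, 2IN, STEEL")

same_item_score = cosine_similarity(d1, d2)       # identical token sets
different_item_score = cosine_similarity(d1, d3)  # no tokens in common
```

Because the comparison ignores word order and punctuation, the two bearing descriptions score as identical while the valve scores as unrelated. Learned word vectors would additionally bring synonyms and abbreviations closer together, which count vectors cannot do.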

CODA ML-MDM integrates many ML and NLP technology concepts in achieving this functionality. The core sub-components which are involved in the ML based Material Code verification process are as follows:

  1. Extraction Parser
    When a material code is given, the extraction parser identifies the functional category of each code segment, such as Nouns, Modifiers, Attributes, Values, UOMs, Part-Numbers, Equipment, Parts and Accessories. The Extraction Parser can correctly identify the type of each code segment even if there are abbreviations, spelling mistakes, scrambling etc. To achieve a high level of accuracy, the extraction parser uses Multi-level Parsing, which integrates top-down parsing based on word structure with bottom-up parsing based on labelling and grouping.
  2. Validation Module
    CODA ML-MDM leverages voluminous data catalogs of different material attributes to validate UOMs, suppliers, part-numbers etc. and ensure that these values describe the material accurately and without inconsistency.
  3. Material Code Structure Standardization and Re-writing Module
    This module of CODA ML-MDM can reorganize the sub-words of a material code into a pre-defined, organization-level coding scheme to ensure consistency in material coding.
  4. MetaData Repositories
    CODA ML-MDM includes MetaData Repositories for Nouns, Modifiers, Attributes, Values, Suppliers, Part-Numbers etc. which have been indexed and curated by crawling the web, manufacturer sites etc. and by drawing on expert-curated, industry-standard repositories like UNSPSC.
  5. Material Code Analytics
    CODA ML-MDM performs several analytics on the material codes, such as Equipment-Part-Accessory Analysis, to identify the exact material indicated in the code before declaring the material code to be a duplicate.
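
The parsing and re-writing steps above can be sketched roughly as follows. This is a simplified illustration, not the platform's parser: the lookup tables, the label set, and the NOUN, MODIFIER, ATTRIBUTE, VALUE ordering are all assumptions, and spelling correction is left out.

```python
import re

# Hypothetical lookup tables; a real system would use much larger curated repositories.
ABBREVIATIONS = {"brg": "bearing", "vlv": "valve", "ss": "stainless steel"}
NOUNS = {"bearing", "valve", "pump"}
MODIFIERS = {"ball", "gate", "centrifugal"}
UOMS = {"mm", "in", "kg"}


def parse_segments(description):
    """Label each segment as noun, modifier, value (number + UOM) or attribute."""
    labelled = []
    for raw in re.split(r"[,\s]+", description.strip().lower()):
        if not raw:
            continue
        word = ABBREVIATIONS.get(raw, raw)  # expand known abbreviations
        measure = re.fullmatch(r"(\d+(?:\.\d+)?)([a-z]+)", word)
        if word in NOUNS:
            label = "noun"
        elif word in MODIFIERS:
            label = "modifier"
        elif measure and measure.group(2) in UOMS:
            label = "value"
        else:
            label = "attribute"  # fallback for anything not recognized
        labelled.append((label, word))
    return labelled


def rewrite(description, order=("noun", "modifier", "attribute", "value")):
    """Reorder the labelled segments into the assumed coding scheme."""
    labelled = parse_segments(description)
    ordered = [word for wanted in order for label, word in labelled if label == wanted]
    return ", ".join(word.upper() for word in ordered)


standardized = rewrite("ball brg 20mm ss")
```

With these tables, the scrambled and abbreviated input `ball brg 20mm ss` is rewritten as `BEARING, BALL, STAINLESS STEEL, 20MM`.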

Business Applications of CODA ML-MDM

The key business benefit of CODA ML-MDM is better Material Data-Quality Management and an optimized Material Code De-Duplication Process. CODA ML-MDM provides significant improvements in all the sub-processes of Material Data-Quality Management. By improving the efficiency of these sub-processes, CODA ML-MDM improves accuracy and reduces time and effort across the entire lifecycle of Master Data Management.

  1. ML-Search and Duplicate Identification
    The CODA ML-MDM tool has a similarity search algorithm based on word-vector representations to match material codes, and a domain-specific material-similarity scoring tool to identify whether two material codes are the same or different. The vector-based similarity algorithm is scalable and can compare thousands of material codes in seconds. By leveraging this algorithm, CODA ML-MDM can identify duplicates within the existing material code dataset within seconds, reducing the time and effort needed to analyze the material codes manually.
  2. Health Checkup
    CODA ML-MDM also has a material code completeness checking algorithm which can analyze a material code and identify the nouns, modifiers, attributes, values, manufacturers etc. within the material code and PO-Text. The algorithm can also identify which key pieces of information are missing or inconsistent in a material code. CODA ML-MDM leverages this algorithm to provide an automatic Health Checkup for a material code dataset.
    The CODA ML-MDM Health Checkup module can analyze a customer’s material code dataset and provide data-quality metrics such as the percentage of material codes which have complete attribute-value information, material codes with insufficient supplier information, material codes inconsistent with PO-Text, material codes without part-number or reference identifiers etc. Using this tool, customers can also estimate the effort needed to improve the data quality of their material code datasets.
  3. Material Code-Standardization Process
    Most organizations put coding standards and policies in place for creating material codes. However, due to execution inefficiencies and the non-availability of enforcement systems, discrepancies in material codes start to increase, introducing more and more duplicate material codes into the ERP system.
    CODA ML-MDM tool can auto-analyze each of the material-codes in the dataset and can auto-rewrite the material codes in a standardized format specified by the organization. The CODA ML-MDM auto-rewrite tool can improve the data-quality health of the material master-data within a short duration, thereby avoiding costly periodic audits of the material dataset.
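
The kind of metrics a health checkup reports can be sketched as follows. This is only an illustration under stated assumptions: the record layout, the list of required fields, and the sample records are hypothetical, and the records are assumed to have been parsed into fields already.

```python
# Hypothetical set of fields a material record must carry to count as complete.
REQUIRED_FIELDS = ("noun", "attributes", "manufacturer", "part_number")


def missing_fields(record):
    """Return the required fields that are absent or empty in one record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


def health_report(records):
    """Summarize completeness across a dataset of parsed material records."""
    report = {
        "total": len(records),
        "complete_pct": 0.0,
        "missing_counts": {name: 0 for name in REQUIRED_FIELDS},
    }
    if not records:
        return report
    complete = 0
    for record in records:
        gaps = missing_fields(record)
        if not gaps:
            complete += 1
        for name in gaps:
            report["missing_counts"][name] += 1
    report["complete_pct"] = 100.0 * complete / len(records)
    return report


# Made-up sample records for illustration only.
records = [
    {"code": "M-001", "noun": "BEARING", "attributes": {"bore": "20MM"},
     "manufacturer": "ACME", "part_number": "6204"},
    {"code": "M-002", "noun": "VALVE", "attributes": {},
     "manufacturer": "", "part_number": "GV-2"},
    {"code": "M-003", "noun": "PUMP", "attributes": {"type": "CENTRIFUGAL"},
     "manufacturer": "ACME", "part_number": None},
    {"code": "M-004", "noun": "GASKET", "attributes": {"size": "2IN"},
     "manufacturer": "ACME", "part_number": "G-2"},
]
report = health_report(records)
```

The per-field counts in `report["missing_counts"]` give a rough basis for estimating the clean-up effort, since each count is a number of records that needs that field researched and filled in.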

CODA ML-MDM and Prosol: AI-Augmented Real-Time De-Duplication Process Management

Even though AI, ML and NLP tools are adding significant value in the data quality management process and material code verification, human experts are still needed to verify the validity of new material codes in newly emerging product categories. Human Experts are also needed to cross-verify whether the difference between two material codes is significant enough to warrant new material-code creation or whether the difference can be ignored and existing material code itself can be used for the new material. Hence, ML-MDMs are optimal when they are integrated seamlessly with existing systems such as SAP MDM tools.

The CODA ML-MDM platform can be integrated alongside the MDM tools already present in an SAP ERP system, so that every material code is verified first by the ML algorithms and then by CODA’s human experts. This keeps the MDM’s data quality consistent at all times, without any delay to the actual users of the ERP system. CODA’s PROSOL is the business process-management tool which can automate this process. The CODA ML-MDM integrated process-management tool acts as an AI-augmented, real-time material code de-duplication process in which the AI algorithms augment the human experts by improving the speed at which material codes can be verified.
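
The two-stage flow described above, ML first and a human expert second, might look like the following sketch. It does not reflect PROSOL's actual interfaces: the class, function names, threshold, and material codes are hypothetical, and the similarity scores are assumed to come from an upstream matching step.

```python
from dataclasses import dataclass


@dataclass
class ReviewItem:
    request: str       # description of the requested new material
    candidates: list   # (existing_code, score) pairs, best match first
    suggestion: str    # ML hint shown to the expert


def prescreen(request, scored_matches, likely_duplicate_at=0.9):
    """ML step: rank candidate duplicates and attach a suggestion for the expert."""
    candidates = sorted(scored_matches, key=lambda pair: pair[1], reverse=True)
    top_score = candidates[0][1] if candidates else 0.0
    if top_score >= likely_duplicate_at:
        suggestion = "likely duplicate"
    else:
        suggestion = "likely new material"
    return ReviewItem(request, candidates, suggestion)


def expert_decision(item, approve_new):
    """Human step: the expert confirms or overrides the ML suggestion."""
    if approve_new or not item.candidates:
        return "create new code"
    return "reuse " + item.candidates[0][0]


item = prescreen("ball bearing 20mm 6204", [("M-017", 0.42), ("M-001", 0.97)])
decision = expert_decision(item, approve_new=False)
```

The ML step never makes the final call in this sketch. It only orders the evidence so the expert sees the strongest candidate first, which is where the speed-up comes from.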

CODA ML-MDM Business Benefits and Advantages

The following are the key business benefits which an organization can gain by adopting the CODA ML-MDM and CODA PROSOL platforms.

  1. Enforce Data-Quality in Real Time
    Instead of relying on periodic data audits, which cost both time and effort, organizations can use the CODA PROSOL platform to maintain their material data quality in real time and to keep the physical inventories and the material information in the ERP systems synchronized at all times.
  2. Avoid Periodic Audits for Materials and IT systems
    By ensuring the data-quality in the MDM, CODA ML-MDM helps enterprises avoid the costly periodic audits by external service providers to align the IT systems and Physical Inventories.
  3. Save redundant inventory cost
    By avoiding redundancy in the material codes, CODA ML-MDM helps ensure that there are no duplicate or unused inventories in the warehouse, and thereby enables significant cost savings by avoiding wastage in the inventory.
  4. Find additional material information
    CODA ML-MDM provides a search tool which can supply additional information about a material code, such as its original manufacturer, even if such data elements are not available within the organization.


CODA ML-MDM is a new-generation material data management system which helps professionals in large and medium enterprises organize the information system records for large volumes of materials and equipment. CODA ML-MDM reduces the time and effort needed to keep material management systems up to date, and enables material management professionals to focus on their core competency of supply chain management.

