The UpGuard research team can now report that an Elasticsearch instance used as data storage for the debt collection system ENCollect has been secured. The server contained data about loans from multiple Indian and African financial services companies that had apparently been sent to ENCollect for collection. The data totalled 5.8GB in storage size and contained 1,686,363 records. Those records included personal information such as name, loan amount, date of birth, account number, and more. A total of 48,043 unique email addresses were in the collection, some of which belonged to the product administrators, corporate clients, and collection agents assigned to each case.
On February 16, 2022, UpGuard analysts detected the database, identified the likely sensitive nature of the data, and attempted to contact Entiger, the maker of the ENCollect product. An email sent to the address listed on their website was returned as undeliverable. A notification was also submitted through the demo form and chat widget on the site. When those received no response, an analyst notified Sumeru US, part of the larger Sumeru company from which Sumeru Enterprise Tiger Business Solutions had been spun out, on February 17. When that also received no response, an analyst submitted an abuse complaint to Microsoft, the hosting provider, on February 18. On February 22 the database was still accessible, and an analyst notified the Indian Computer Emergency Response Team, which promptly replied. On Monday, February 28, the database was no longer accessible.
ENCollect is a loan collection app, providing agents with data on outstanding loans to be sent to collections. Those loans appear to have originated with lenders named in the database indices and in the data itself, including Lendingkart, IndiaLends, Shubh Loans (MyShubhLife), Centrum, Rosabo, and Accion. The data also included information about the borrower such that the collections agent could identify them. About half of the indices included “prod” in their name, indicating they were for or from a production environment.
The leaked data included significant amounts of PII for loan recipients. The amount of personal data within each record varied. After removing both null and duplicate values, the exposed data contained 114,747 mailing addresses, 105,974 phone numbers, and 48,043 email addresses. A relatively small number of records (5-10%) exposed additional data points, including the contact details of co-applicants, family members, and/or personal references. The vast majority of email addresses were for free personal mail providers like gmail.com and yahoo.com, but the most frequent domains also included some business accounts, like UBA Group and Exxon.
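Counts like these depend on how null and duplicate values are handled. The following is a minimal sketch of one way to count unique, non-null values for a field, not a description of the method UpGuard analysts used. The field names, the sample records, and the normalization rules (trimming whitespace, lower-casing, and treating a few placeholder strings as null) are all assumptions made for illustration.

```python
from typing import Iterable, Mapping, Optional

# Placeholder strings treated as null. This set is an assumption for the
# sketch; a real data set may use different markers for missing values.
NULL_MARKERS = {"null", "none", "n/a"}


def normalize(value: object) -> Optional[str]:
    """Return a trimmed, lower-cased string, or None for null-like values."""
    if value is None:
        return None
    text = str(value).strip().lower()
    if not text or text in NULL_MARKERS:
        return None
    return text


def count_unique(records: Iterable[Mapping[str, object]], field: str) -> int:
    """Count distinct non-null values of `field` across all records."""
    seen = set()
    for record in records:
        value = normalize(record.get(field))
        if value is not None:
            seen.add(value)
    return len(seen)


# Hypothetical sample records; the field names are illustrative only.
records = [
    {"email": "a@example.com", "phone": "555-0100"},
    {"email": "A@Example.com ", "phone": None},
    {"email": "", "phone": "555-0101"},
    {"email": "b@example.com", "phone": "555-0100"},
]

unique_emails = count_unique(records, "email")
unique_phones = count_unique(records, "phone")
```

Stricter or looser normalization (for example, stripping punctuation from phone numbers) would change the totals, which is why any published count is best read alongside the cleaning rules that produced it.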
Some loan data was available, including 157,403 loan amounts. Some records contained overdue amounts, the type and length of the loan, and internal notes left by collection agency staff regarding loan repayments.
Taken together, this data set provides information that could put individuals, and in some cases their employers, at risk. The most obvious misuse would be to impersonate the loan collectors and target the borrowers with personal information about them, their family, and their outstanding debts, potentially in a form of man-in-the-middle attack. Beyond that, knowing that a person is in debt reveals the kind of financial motive that is most likely to lead employees to mishandle funds, which is why credit histories are often part of an employment background check.
The same IP address also hosted a Kibana frontend that allowed access without authentication. Here again, the dashboards were marked as either demo or “PRD” for the production data sets. Because the database itself was already publicly accessible, this misconfiguration did not expose any additional information, but it illustrates another point of failure that must be secured to avoid unintended data disclosure.
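For teams that run these services themselves, a simple self-check is to confirm that an anonymous request is rejected. The following is a minimal sketch, intended only for systems you own or are authorized to test. It assumes that a 401 or 403 response means authentication is enforced and that a 2xx response to an anonymous request means the endpoint is open; the host name in the usage comment is hypothetical. Ports 9200 and 5601 are the Elasticsearch and Kibana defaults, and a real deployment may use others.

```python
import urllib.error
import urllib.request


def classify_status(status: int) -> str:
    """Map an HTTP status from an anonymous request to a coarse verdict."""
    if status in (401, 403):
        return "auth required"
    if 200 <= status < 300:
        return "open"
    return "inconclusive"


def probe(url: str, timeout: float = 5.0) -> str:
    """Send one unauthenticated GET request and classify the result."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx and 5xx responses.
        return classify_status(err.code)
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar errors.
        return "unreachable"


# Example usage against your own infrastructure (hypothetical host name):
#   probe("http://es.internal.example:9200/")            # Elasticsearch root
#   probe("http://es.internal.example:5601/api/status")  # Kibana status API
```

An "open" verdict from a public network is the condition described above. An "unreachable" verdict may simply mean a firewall is doing its job, so the check is most informative when run from outside the network perimeter.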
The digitization of financial services provides many opportunities for efficiencies in processes like debt collection, but also creates unexpected risks in the supply chain. Vendor solutions also create the risk of multiparty exposures when their data sets are sourced from several clients, as in this case.
It is also worth noting that while we do not know the exact series of events that led to the database being secured, the existence of the Indian Computer Emergency Response Team (CERT-In) and its response capabilities provide a valuable safety net for securing data exposures. It is not uncommon for companies to miss notification emails from researchers and from their hosting providers. Recently, CERT-In published guidelines requiring organizations to inform them within 6 hours of a data breach notification. Having a governmental entity so receptive to leak notifications provides another avenue for efforts to secure exposures.