As the registry for national domain names, CNNIC actively detects domain name abuse and protects netizens' privacy and property, contributing to the construction of a trusted Internet environment.
Phishing Detection
CNNIC's anti-phishing research and development uses its resource advantages to actively seek out phishing websites that harm Chinese brands on the Internet, rather than relying on the traditional passive anti-phishing mechanism.
The Unique Resource Advantages of CNNIC
• As the registry of the country code top-level domains .CN and .中国, CNNIC possesses rich network data resources and provides top-level domain name resolution services and recursive domain name resolution services.
• As the Secretariat of the Anti-phishing Alliance of China (APAC), CNNIC has access to the phishing data reported to APAC, which provides data support for the research.
Phishing Detection Technology:
By carrying out large-scale data analysis of hosts under all TLDs that appear in DNS recursive resolution data, and combining it with prior knowledge from phishing reports, CNNIC can find phishing websites quickly and accurately. Several technologies are used in combination, including domain name similarity detection, DNS log mining, statistical analysis of reported phishing data, automatic generation of phishing URLs, and reverse IP lookup, to reach a fine-grained, multi-perspective judgment of phishing websites.
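As an illustration of domain name similarity detection, one of the techniques listed above, the following minimal Python sketch flags domains whose main label is within a small edit distance of a protected brand name. It is not CNNIC's implementation: the use of Levenshtein distance, the threshold, and the names `edit_distance`, `registrable_label`, and `find_lookalikes` are all assumptions made for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character from a
                            curr[j - 1] + 1,      # insert a character into a
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def registrable_label(domain: str) -> str:
    """Hypothetical simplification: use the label just left of the TLD.

    This ignores multi-level suffixes such as .com.cn.
    """
    parts = domain.lower().rstrip(".").split(".")
    return parts[-2] if len(parts) >= 2 else parts[0]


def find_lookalikes(candidates, brands, max_distance=1):
    """Return (domain, brand, distance) for near-miss labels, excluding exact matches."""
    hits = []
    for domain in candidates:
        label = registrable_label(domain)
        for brand in brands:
            d = edit_distance(label, brand)
            if 0 < d <= max_distance:
                hits.append((domain, brand, d))
    return hits


if __name__ == "__main__":
    observed = ["examp1e.cn", "example.cn", "unrelated.cn"]
    print(find_lookalikes(observed, ["example"]))
```

In practice such a check would be only one signal among several, since a small edit distance alone also matches many legitimate domains.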
Scale and Workflow of Daily Data Processing
CNNIC mines 400 million DNS recursive resolution records every day, covering tens of millions of distinct hosts. Every step, from source data push and data preprocessing to phishing website judgment and phishing evidence retention, is processed automatically by machine. In addition, with the help of APAC's fast track, CNNIC can ensure that discovered phishing websites are reviewed and handled promptly.
Pornography Detection
Overview: CNNIC has independently developed an automatic system for discovering and judging pornographic information on the Internet, and holds the intellectual property rights to it. By taking full advantage of CNNIC's enormous volume of national domain name registration and resolution data, the system discovers pornographic national domain names rapidly and accurately every day. This allows pornographic websites to be discovered and shut down while they have been accessed by only a few Internet users, or even before any user has accessed them, which safeguards the reputation of national domain names.
Active discovery technology for pornographic web pages: Large-scale analysis of domain name registration data and authoritative DNS resolution data, combined with active capture and analysis of host pages, ensures rapid and accurate discovery of pornographic national domain names. The system covers the following technologies: large-scale log analysis and processing, fast Bayes detection, forward cheating detection, hidden cheating detection, page co-occurrence word analysis, automatic evidence capture, etc.
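The source does not describe how its Bayes detection works. As a rough illustration of Bayes-style page classification in general, the sketch below implements a small multinomial naive Bayes classifier with Laplace smoothing over page tokens. The class name `NaiveBayesTextClassifier`, the labels, and the toy training tokens are all hypothetical.

```python
import math
from collections import Counter


class NaiveBayesTextClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of token frequencies
        self.doc_counts = Counter()  # label -> number of training pages
        self.vocab = set()           # all tokens seen during training

    def train(self, words, label):
        self.word_counts.setdefault(label, Counter()).update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def log_scores(self, words):
        """Return log P(label) + sum of log P(word | label) for each label."""
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            score = math.log(self.doc_counts[label] / total_docs)
            for w in words:
                # Add-one smoothing keeps unseen tokens from zeroing the score.
                score += math.log((counts[w] + 1) / (total_words + len(self.vocab)))
            scores[label] = score
        return scores

    def predict(self, words):
        scores = self.log_scores(words)
        return max(scores, key=scores.get)


if __name__ == "__main__":
    clf = NaiveBayesTextClassifier()
    # Toy, hypothetical training data: tokens extracted from captured pages.
    clf.train(["adult", "explicit", "video"], "flagged")
    clf.train(["explicit", "adult", "content"], "flagged")
    clf.train(["news", "weather", "sports"], "clean")
    clf.train(["weather", "forecast", "news"], "clean")
    print(clf.predict(["adult", "video"]))
    print(clf.predict(["weather", "news"]))
```

Naive Bayes is cheap to train and to evaluate, which is one reason it suits screening very large numbers of pages before more expensive checks are applied.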
Scale of Data: About 1.6 billion raw records are processed every day. After de-duplication, about 20 million distinct hosts remain per day. The system carries out page analysis and judging of these hosts every day.
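The de-duplication step can be pictured with the short Python sketch below, which normalizes host names and keeps the first occurrence of each. It is a single-machine illustration only: the function name `distinct_hosts` and the normalization rules (trimming whitespace, lower-casing, dropping the trailing dot) are assumptions, and the source does not say how the production system de-duplicates its data.

```python
def distinct_hosts(records):
    """Yield each host name once, in first-seen order.

    Normalization here is an assumption for the example: trim whitespace,
    lower-case, and drop a trailing root dot.
    """
    seen = set()
    for raw in records:
        host = raw.strip().lower().rstrip(".")
        if host and host not in seen:
            seen.add(host)
            yield host


if __name__ == "__main__":
    raw_records = ["WWW.Example.cn.", "www.example.cn", " shop.example.cn ", ""]
    print(list(distinct_hosts(raw_records)))
```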