SMACOM's unique "Securities Report Sentiment Score"

In March 2021, Nikkei Financial Technology Research Institute (also known as Nikkei FTRI) began offering services related to the new SMACOM information distribution platform, which provides corporate analytical information (https://www.nikkei.co.jp/nikkeiinfo/en/global_services/nikkei-ftri/providing-services-for-the-new-smacom-information-distribution-platform.html). SMACOM makes valuable information available to support the decision-making processes of leading financial market professionals, and it evaluates more than 40,000 companies in 51 major economies. Nikkei FTRI is characterized by its effective use of alternative data, AI, sophisticated technologies and unique analytical skills.

Here, we would like to shed some light on the Securities Report Sentiment Score (or SR Sentiment Score), which is one of the metrics delivered by SMACOM. Simply put, the SR Sentiment Score is a metric derived from the analysis of text data in Japanese annual securities reports (or yukashoken hokokusho), and it is designed to support investment decision making.

The Annual Securities Report is one of the annual reports that listed companies are required to file, equivalent to a 10-K in the US. It contains information necessary for making various investment decisions. Although its main content is financial data (i.e., numerical data), the majority of the information is written in text form. Financial data can be easily obtained by purchasing it from various vendors, and since it is numerical data, it can be analyzed without excessive difficulty.

On the other hand, alternative data, such as text data from the Annual Securities Report, requires data acquisition and pre-processing before analysis can begin, placing a heavy burden on the analyst. In addition, the analysis of text data is more complex and time-consuming than the analytical process for numerical data.

However, we already have an environment for extracting, pre-processing and analyzing text data from securities reports, based on our knowledge of text data analysis methods acquired through the creation of the News Sentiment Score (https://www.nikkei.co.jp/nikkeiinfo/en/global_services/nikkei-ftri/smacoms-unique-news-sentiment-score.html). Therefore, it is possible to easily obtain and analyze textual information from the securities reports of Japanese listed companies, and we have succeeded in developing an innovative score that can be used for investment decision making.

The model for this score was developed by combining various grammatical and other factors, such as the readability of the text in the securities report and the level of vocabulary. Specifically, factors taken into account include the percentage of negative forms, hypothetical forms and parts of speech included in the text. Moreover, the entire text is evaluated from other perspectives, such as the difficulty level of the kanji characters used in the text and the diversity of the vocabulary, which indicates how many different words and phrases are used in a text of the same length. By examining texts in these different ways, it is possible to precisely gauge the differences between the texts of securities reports, a task that is difficult even for human readers. The securities report score developed using natural language processing technology can be easily used by personnel at overseas hedge funds and other institutions whose first language is not Japanese. It can also be used by Japanese speakers as reference material for investment decision-making, making it unnecessary to read the Japanese text itself.
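As a rough illustration of the kinds of text factors described above, the following Python sketch computes the share of negative forms, the share of hypothetical forms, the distribution of parts of speech and a simple vocabulary diversity measure (distinct tokens divided by total tokens). It is not the SMACOM model: the function name, the marker lists and the input format are all assumptions made for this example, the text is assumed to have been split into (surface, part of speech) pairs by some Japanese tokenizer beforehand, and kanji difficulty is not covered.

```python
from collections import Counter

# Hypothetical marker lists, chosen only for illustration.
# The article does not say how negative or hypothetical forms are detected.
NEGATIVE_MARKERS = {"ない", "ず", "ぬ", "ません"}
HYPOTHETICAL_MARKERS = {"ば", "たら", "なら"}


def text_features(tokens):
    """Compute simple text factors from pre-tokenized text.

    tokens: list of (surface, part_of_speech) pairs, assumed to come
    from any Japanese tokenizer run before this function is called.
    """
    n = len(tokens)
    if n == 0:
        raise ValueError("tokens must not be empty")

    surfaces = [surface for surface, _ in tokens]
    pos_counts = Counter(pos for _, pos in tokens)

    return {
        # Share of tokens that are negative-form markers.
        "negative_ratio": sum(s in NEGATIVE_MARKERS for s in surfaces) / n,
        # Share of tokens that are hypothetical-form markers.
        "hypothetical_ratio": sum(s in HYPOTHETICAL_MARKERS for s in surfaces) / n,
        # Vocabulary diversity: distinct tokens per token. Only comparable
        # between texts of the same length, as the article notes.
        "type_token_ratio": len(set(surfaces)) / n,
        # Share of each part of speech.
        "pos_ratio": {pos: count / n for pos, count in pos_counts.items()},
    }


# Invented, hand-tokenized sample; not taken from any real securities report.
sample = [
    ("売上", "noun"), ("は", "particle"), ("増加", "noun"),
    ("し", "verb"), ("ない", "auxiliary"), ("売上", "noun"),
]
print(text_features(sample))
```

Because the diversity measure depends on text length, a practical version would probably compare texts truncated to a common number of tokens.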

The following graph shows the performance of a strategy that takes a long position in a portfolio of stocks with a score of 70 or more and a short position in a portfolio of stocks with a score of 30 or less, measured after 20 days, using past data from 2018 to 2021. As you can see from the graph, the securities report score (blue line) outperforms the TOPIX (orange line), which indicates that the score was very effective over this period.
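The long/short construction described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the method used to produce the graph: the function name is hypothetical, positions are assumed to be equally weighted, and the 20-day forward returns are assumed to be computed elsewhere and passed in. The tickers, scores and returns in the example are invented purely to show the mechanics.

```python
from statistics import mean


def long_short_return(scores, forward_returns, long_min=70, short_max=30):
    """Return the long-minus-short spread for one rebalancing date.

    scores: dict mapping ticker -> score (assumed to be on a 0-100 scale).
    forward_returns: dict mapping ticker -> return over the holding
        period (for example 20 days), computed elsewhere.
    Equal weighting within each portfolio is an assumption of this sketch.
    """
    longs = [forward_returns[t] for t, s in scores.items() if s >= long_min]
    shorts = [forward_returns[t] for t, s in scores.items() if s <= short_max]
    if not longs or not shorts:
        raise ValueError("both the long and the short portfolio need members")
    # Profit of holding the long portfolio and shorting the short portfolio.
    return mean(longs) - mean(shorts)


# Invented toy data for illustration only; not actual scores or returns.
toy_scores = {"A": 85, "B": 72, "C": 50, "D": 25, "E": 10}
toy_returns = {"A": 0.04, "B": 0.02, "C": 0.01, "D": -0.01, "E": -0.03}
print(long_short_return(toy_scores, toy_returns))
```

Repeating this calculation on each rebalancing date and compounding the results would give a performance line that could be compared against a benchmark such as the TOPIX.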

