ABAP Text Search

There is a difference between searching and finding, so implementing a plain text search in ABAP is more challenging than querying a table attribute. The user-specified search pattern cannot be matched using the simple test for equality used for attributes. Sophisticated plain text matching algorithms are needed to make it attractive for users attuned to Google-search:

  • the Full Text Search (FTS) matches a pattern exactly.
  • the Fuzzy search performs approximate string matching according to a distance algorithm tuned to a problem domain that finds both exact matches and deviations.
  • the Acoustic search is an approximate String matching according to a phonetic algorithm.

Such an user friendly, fault-tolerant search can be made fast efficiently: the approach is pre-process text and store the result in an index table.  This is done behind the scene in the SAP Enterprise Search Tool. But what other options do the ABAP developer have to search for text in tables?

ABAP SQL

A simple pattern matching is enabled by the SQL LIKE statementSELECT product FROM catalog WHERE name LIKE '%_agento%'.

Long texts saved in tables STXH and STXL are accessed with function modules like READ_TEXT that can also read the text buffer and the archive. So this approach does not allow searching for long text. Fuzzy search is not possible in ABAP SQL.

If you do not have TREX of SAP Enterprise Search enabled, then generic text search might not be high on your agenda.

HANA DB

The HANA DB (HDB) supports FTS, Fuzzy and Acoustic search (cf. search, analysis and mining).

I discovered CONTAINS( ) predicate in HANA DB as I wanted to implement a text search in ABAP. Note the prerequisite is to create a full-text index. This can be done in the dictionary for HDB only. The alternative is to define the column as TEXT at HDB table definition level.

My approach for long text search was to duplicate the long text content into a HDB TEXT column. This in effect creates an index, enabling queries using CONTAINS( ).

CONTAINS( ) queries were implemented using ABAP Database Connectivity (ADBC). I came to realize Fuzzy Text search could be a Killer App for ADBC. It works! (and yes, you can also use AMDP.

The ADBC methods can be used to

  • create and manage connections to all database maintained in transaction DBACOCKPIT (table DBCON).
  • execute DB specific statements
  • process the query results

Known Problems:

  • No automatic client handling
  • Security risk: SQL Injection.

Important ADBC Classes

  • CL_SQL_CONNECTION
  • CL_SQL_STATEMENT
    • execute_ddl( ) – CREATE TABLE, DROP TABLE
    • execute_query( ) – SELECT FROM WHERE
    • execute_update( ) – INSERT INTO VALUES ( )
  • CL_SQL_PREPARED_STATEMENT
    • prepare_statement( )
  • CL_SQL_RESULT_SET – Result set of an SQL query
    • set_param_table( )
    • next_package( )
    • close( )
  • CX_SQL_EXCEPTION
  • CL_SHDB_SELTAB method sql_where_condition( ) and the static method COMBINE_SELTABS( ) shall be used to convert SELECT-OPTIONS into Native SQL.
  • Demo Report DEMO_ADBC_DDL_DML

Sample Code

generates SQL statement for fuzzy text search for the pattern TEST OR PALLETSELECT TOP 200 DISTINCT SCORE() AS FUZZY_SCORE, K.EBELN, P.EBELP FROM EKKO AS K INNER JOIN EKPO AS P ON K.EBELN = P.EBELN AND K.MANDT = P.MANDT INNER JOIN LFA1 AS L ON K.LIFNR = L.LIFNR AND K.MANDT = L.MANDT LEFT OUTER JOIN ADRC AS A ON L.ADRNR = A.ADDRNUMBER AND L.MANDT = A.CLIENT WHERE K.MANDT = '140' AND A.NATION = '' AND CONTAINS(P.TXZ01, 'TEST OR PALLET', FUZZY(0.8, 'textSearch=compare,similarCalculationMode=search' )) AND K.EKORG = 'GLOB' AND K.KDATB <= '20190213' AND K.KDATE >= '20190424' AND K.LOEKZ = '' AND P.LOEKZ = '' AND SCORE() >= 0.20000000 ORDER BY FUZZY_SCORE DESC

Sybase

After this successful endeavour, I searched the web and realized the CONTAINS( ) predicate is also available on other platforms. The SAP Netweaver demo system NPL runs on Sybase. ABAP does not support full text index for other DBs.

Index ZTOAD-Y01 is of type Full Text. This is not implemented for DB SYB

Message no. DT558

But Sybase also support the CONTAINS( ) statement for exact and fuzzy text search.

So how to enable the indexing? The help says “NGRAM TEXT index search is mainly useful when words are misspelled. Sybase IQ does not support searches like synonyms and antonyms.” for Sybase (SYB).

Oracle DB

Oracle supports fuzzy text search with a variants of the CONTAINS( ) statement.

MS SLQ

MS SQL Server supports full-text searching with the CONTAINS( ) statement.

Other DBs

On other systems, you could try to use a custom of the string distance functions as a DB procedure if not already available.

  • Here is a T-SQL (MS SQL server) implementation of the Levenshtein distance.
  • MySQL has Full Text Search WHERE MATCH(column) AGAINST('Rose', 'Crown') and the SOUNDEX( ) function for acoustic search

SELECT * FROM `author_bios` WHERE SOUNDEX(`auth_last_name`) = SOUNDEX('Williams')

Summary

The CONTAINS( ) statement is very helpful for text search, but it is not supported in Open SQL. However, variants of this statements are supported by various databases. With this, Fuzzy Text search could be implemented using ABAP DB Connectivity (ADBC). It works well for SAP HANA and I think it can be made to work for other platforms.

So ABAP Text Search seems to be the poster use case for ADBC.


Interested in working in the SAP/ABAP/HANA environment?

Please follow up on our career page.