Domino on Linux/Unix, Troubleshooting, Best Practices, Tips and more ...


Daniel Nashed

Apache Tika 3.0.0 released - Available in the Domino Container

Daniel Nashed – 24 November 2024 14:01:58

Apache Tika is a Java-based project that Domino uses to extract text from attachments when full-text indexing with the search filters.
It is a single JAR that runs as a separate process and listens on the loopback interface to perform attachment parsing.

Tika could actually also be used for your own applications, if you start another instance.
I blogged about it some time ago --> https://blog.nashcom.de/nashcomblog.nsf/dx/tika-in-notesdomino.htm.
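
As a rough illustration of using a separate Tika instance from your own application, here is a minimal Python sketch. It assumes you have started your own Tika server on its default port 9998 (not the instance Domino manages) and uses Tika server's `PUT /tika` endpoint with `Accept: text/plain`. The function names `build_extract_request` and `extract_text` are made up for this example.

```python
import urllib.request

# Assumption: a separately started Tika server listening on loopback,
# on Tika server's default port 9998. Adjust to your own instance.
TIKA_URL = "http://127.0.0.1:9998"


def build_extract_request(data: bytes, base_url: str = TIKA_URL) -> urllib.request.Request:
    """Build a PUT /tika request that asks for plain-text output."""
    return urllib.request.Request(
        url=f"{base_url}/tika",
        data=data,
        method="PUT",
        headers={
            "Accept": "text/plain",
            # Generic type so Tika detects the real format itself.
            "Content-Type": "application/octet-stream",
        },
    )


def extract_text(path: str, base_url: str = TIKA_URL, timeout: float = 30.0) -> str:
    """Send a file to the Tika server and return the extracted text."""
    with open(path, "rb") as f:
        request = build_extract_request(f.read(), base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `extract_text("/tmp/sample.pdf")` (a hypothetical path) would return the parsed text if a Tika server is reachable at that address.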

Domino 14.5 EA1 and 14.0 FP2 contain the latest stable Tika Server 2.9.2 release.
Now that Tika 3.0.0 is finally released, you can expect Domino 14.5 to switch to the new major version as well.

The container project provides a build option to replace the Tika version.
I have just updated Tika to 3.0.0 in the container build and did a quick test.
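
One way to confirm which Tika version an instance is running is to query the server's `GET /version` endpoint. The sketch below is only an assumption about how such a check could look, not the test described in this post. It assumes a Tika server on the default port 9998 and a version banner along the lines of "Apache Tika 3.0.0". The helper names are made up for this example.

```python
import re
import urllib.request

# Assumption: Tika server reachable on loopback at its default port 9998.
TIKA_URL = "http://127.0.0.1:9998"


def parse_tika_version(banner: str) -> tuple[int, ...]:
    """Extract the dotted version number from a banner such as 'Apache Tika 3.0.0'."""
    match = re.search(r"(\d+(?:\.\d+)+)", banner)
    if match is None:
        raise ValueError(f"no version number found in {banner!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def get_tika_version(base_url: str = TIKA_URL, timeout: float = 10.0) -> tuple[int, ...]:
    """Ask the Tika server for its version banner and parse it."""
    with urllib.request.urlopen(f"{base_url}/version", timeout=timeout) as response:
        banner = response.read().decode("utf-8", errors="replace")
    return parse_tika_version(banner)
```

Comparing the result against `(3,)` would tell you whether the instance is already on the new major version.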





Comments

1. Christian Henseler, 24.11.2024 18:16:28

Thanks for the info, but does HCL officially support replacing Tika in Domino?

2. Daniel Nashed, 24.11.2024 18:57:20

@Christian Henseler, there are no interface changes in Tika 3.0.0.

There were REST API differences when moving from 1.x to 2.x, but 3.x does not change the interface.

The connection to Tika is a REST API connection and continues to work.

As soon as Domino 14.5 has Tika 3.0.0, it will also be safe to use for older versions.

"Supported" and "fully tested by HCL" for an older version are different stories.

But you should still get support for a later version of Tika on an earlier Domino release.

3. Jan Peter, 27.11.2024 9:21:41

Can the analysis and performance troubleshooting tool be found on OpenNTF?

4. Daniel Nashed, 27.11.2024 21:36:22

@Jan

Which tool are you referring to? This blog post was about the new Tika version.

I do have a tool I wrote for a customer to troubleshoot Tika problems.

The server task scans databases for attachments, sends them to Tika and collects all results, including CPU consumption.

The tool I wrote is neither open source nor free.

I have a ton of open source/free software. But I can't do everything I do for free.

The tool is only available with consulting because the topic is tricky.
