Setting up a large source-code reading environment with OpenGrok
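The official wiki gives a brief walkthrough of the OpenGrok setup process; see https://github.com/oracle/opengrok/wiki/How-to-setup-OpenGrok
I still ran into a few small problems in practice, so I am recording them here to save others from the same pitfalls.
This article uses Ubuntu 22.10, the latest release at the time of writing, as the example (in my testing, setting up an AOSP build environment on Ubuntu 22.10 worked without any problems).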

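- Installing Tomcat
Tomcat 10.x is required, but the version in the apt repositories is 9.x, so it has to be installed manually.
For the manual installation steps and a supporting script, see https://github.com/lashwang2022/tomcat-installation-ubuntu/blob/main/install-tomcat-ubuntu.sh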
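- Installing and configuring OpenGrok
The command given in the official documentation is:
opengrok-deploy -c /opengrok/etc/configuration.xml \
    /opengrok/dist/lib/source.war /var/lib/tomcat8/webapps
  - Mind the Python version: opengrok-deploy must be run under Python 3. You can create a virtual Python environment with venv.
  - Change the Tomcat path to wherever Tomcat is actually installed.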
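The two notes above can be sketched as a small shell helper. This is only an illustration: the Tomcat location `/opt/tomcat` and the virtual-environment path in the comments are assumptions, and the location of `opengrok-tools.tar.gz` under `dist/tools` should be checked against your own OpenGrok release. The function prints the deploy command instead of running it, so the paths can be reviewed first.

```shell
#!/bin/sh
# Assumed locations (not taken from the official instructions); adjust as needed.
OPENGROK_BASE="${OPENGROK_BASE:-/opengrok}"
TOMCAT_HOME="${TOMCAT_HOME:-/opt/tomcat}"

# One-time Python 3 virtual environment setup, run by hand:
#   python3 -m venv "$HOME/opengrok-venv"
#   . "$HOME/opengrok-venv/bin/activate"
#   python3 -m pip install "$OPENGROK_BASE/dist/tools/opengrok-tools.tar.gz"

# Print the opengrok-deploy command for a given Tomcat home directory.
deploy_command() {
    printf 'opengrok-deploy -c %s/etc/configuration.xml %s/dist/lib/source.war %s/webapps\n' \
        "$OPENGROK_BASE" "$OPENGROK_BASE" "$1"
}

deploy_command "$TOMCAT_HOME"
```

Once the printed command looks right, run it inside the activated virtual environment.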
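- Preparing the source
Symlink the source trees you want indexed into the src directory of the OpenGrok installation; the source itself does not need to live under src. Symlinks also make it easy to add and remove projects.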
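A minimal sketch of the symlink approach, assuming the OpenGrok source root is `/opengrok/src`. The function names and the example path `/data/aosp` are hypothetical.

```shell
#!/bin/sh
# Link a source tree into the OpenGrok source root.
#   $1 = real location of the source tree
#   $2 = OpenGrok source root (for example /opengrok/src)
#   $3 = project name (defaults to the tree's directory name)
link_project() {
    name=${3:-$(basename "$1")}
    ln -sfn "$1" "$2/$name"
}

# Remove a project from the source root; the real tree is left untouched.
#   $1 = project name, $2 = OpenGrok source root
unlink_project() {
    if [ -L "$2/$1" ]; then
        rm "$2/$1"
    fi
}
```

For example, `link_project /data/aosp /opengrok/src` would expose the tree as the project `aosp`, and `unlink_project aosp /opengrok/src` would drop it again. According to the indexer's own help text, only symlinks directly under the source root are followed by default; deeper ones need `-N` or `--canonicalRoot`.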
- Indexing the source
The core of OpenGrok indexing is opengrok.jar. Run "java -jar /opengrok/dist/lib/opengrok.jar -h" to see the supported options:
simon@simon-ubuntu-server:~$ java -jar /opt/opengrok/dist/lib/opengrok.jar -h
Jan 19, 2023 5:05:40 PM org.opengrok.indexer.index.Indexer parseOptions
INFO: Indexer options: [-h]
Usage: java -jar opengrok.jar [options] [subDir1 [...]]
-h, -?, --help [mode]
    With no mode specified, display this usage summary. Or specify a mode:
    config - display configuration.xml examples.
    ctags - display ctags command-line.
    guru - display AnalyzerGuru details.
    repos - display enabled repositories.
--annotationCache on|off
    Annotation cache provides speedup when getting annotation
    for files in the webapp at the cost of significantly increased
    indexing time (multiple times slower) and slightly increased
    disk space (comparable to history cache size).
    Can be enabled per project.
--apiTimeout number
    Set timeout for asynchronous API requests.
--connectTimeout number
    Set connect timeout. Used for API requests.
-A, --analyzer (.ext|prefix.):(-|analyzer)
    Associates files with the specified prefix or extension (case-insensitive)
    to be analyzed with the given analyzer, where 'analyzer'
    may be specified using a class name (case-sensitive e.g. RubyAnalyzer)
    or analyzer language name (case-sensitive e.g. C). Option may be
    repeated.
    Ex: -A .foo:CAnalyzer
    will use the C analyzer for all files ending with .FOO
    Ex: -A bar.:Perl
    will use the Perl analyzer for all files starting with
    "BAR" (no full-stop)
    Ex: -A .c:-
    will disable specialized analyzers for all files ending with .c
-c, --ctags /path/to/ctags
    Path to Universal Ctags. Default is ctags in environment PATH.
--canonicalRoot /path/
    Allow symlinks to canonical targets starting with the specified root
    without otherwise needing to specify -N,--symlink for such symlinks. A
    canonical root must end with a file separator. For security, a canonical
    root cannot be the root directory. Option may be repeated.
--checkIndex
    Check index, exit with 0 on success,
    with 1 on failure.
-d, --dataRoot /path/to/data/root
    The directory where OpenGrok stores the generated data.
--depth number
    Scanning depth for repositories in directory structure relative to
    source root. Default is 3.
--disableRepository type_name
    Disables operation of an OpenGrok-supported repository. See also
    -h,--help repos. Option may be repeated.
    Ex: --disableRepository git
    will disable the GitRepository
    Ex: --disableRepository MercurialRepository
-e, --economical
    To consume less disk space, OpenGrok will not generate and save
    hypertext cross-reference files but will generate on demand, which could
    be slightly slow.
-G, --assignTags
    Assign commit tags to all entries in history for all repositories.
-H
    Enable history.
--historyBased on|off
    If history based reindex is in effect, the set of files
    changed/deleted since the last reindex is determined from history
    of the repositories. This needs history, history cache and
    projects to be enabled. This should be much faster than the
    classic way of traversing the directory structure.
    The default is on. If you need to e.g. index files untracked by
    SCM, set this to off. Currently works only for Git.
    All repositories in a project need to support this in order
    to be indexed using history.
--historyThreads number
    The number of threads to use for history cache generation on repository level.
    By default the number of threads will be set to the number of available CPUs.
    Assumes -H/--history.
--historyFileThreads number
    The number of threads to use for history cache generation
    when dealing with individual files.
    By default the number of threads will be set to the number of available CPUs.
    Assumes -H/--history.
-I, --include pattern
    Only files matching this pattern will be examined. Pattern supports
    wildcards (example: -I '*.java' -I '*.c'). Option may be repeated.
-i, --ignore pattern
    Ignore matching files (prefixed with 'f:' or no prefix) or directories
    (prefixed with 'd:'). Pattern supports wildcards (example: -i '*.so'
    -i d:'test*'). Option may be repeated.
-l, --lock on|off|simple|native
    Set OpenGrok/Lucene locking mode of the Lucene database during index
    generation. "on" is an alias for "simple". Default is off.
--leadingWildCards on|off
    Allow or disallow leading wildcards in a search. Default is on.
-m, --memory number
    Amount of memory (MB) that may be used for buffering added documents and
    deletions before they are flushed to the directory (default 16.0).
    Please increase JVM heap accordingly too.
--mandoc /path/to/mandoc
    Path to mandoc(1) binary.
-N, --symlink /path/to/symlink
    Allow the symlink to be followed. Other symlinks targeting the same
    canonical target or canonical children will be allowed too. Option may
    be repeated. (By default only symlinks directly under the source root
    directory are allowed. See also --canonicalRoot)
-n, --noIndex
    Do not generate indexes and other data (such as history cache and xref
    files), but process all other command line options.
--nestingMaximum number
    Maximum depth of nested repositories. Default is 1.
--reduceSegmentCount
    Reduce the number of segments in each index database to 1. This might
    (or might not) bring some improved performance. Anyhow, this operation
    takes non-trivial time to complete.
-o, --ctagOpts path
    File with extra command line options for ctags.
-P, --projects
    Generate a project for each top-level directory in source root.
-p, --defaultProject path/to/default/project
    Path (relative to the source root) to a project that should be selected
    by default in the web application (when no other project is set either
    in a cookie or in parameter). Option may be repeated to specify several
    projects. Use the special value __all__ to indicate all projects.
--profiler
    Pause to await profiler or debugger.
--progress
    Print per-project percentage progress information.
-Q, --quickScan on|off
    Turn on/off quick context scan. By default, only the first 1024KB of a
    file is scanned, and a link ('[..all..]') is inserted when the file is
    bigger. Activating this may slow the server down. (Note: this setting
    only affects the web application.) Default is on.
-q, --quiet
    Run as quietly as possible. Sets logging level to WARNING.
-R /path/to/configuration
    Read configuration from the specified file.
-r, --remote on|off|uionly|dirbased
    Specify support for remote SCM systems.
    on - allow retrieval for remote SCM systems.
    off - ignore SCM for remote systems.
    uionly - support remote SCM for user interface only.
    dirbased - allow retrieval during history index only for repositories
    which allow getting history for directories.
--renamedHistory on|off
    Enable or disable generating history for renamed files.
    If set to on, makes history indexing slower for repositories
    with lots of renamed files. Default is off.
--repository [path/to/repository|@file_with_paths]
    Path (relative to the source root) to a repository for generating
    history (if -H,--history is on). By default all discovered repositories
    are history-eligible; using --repository limits to only those specified.
    File containing paths can be specified via @path syntax.
    Option may be repeated.
-S, --search [path/to/repository|@file_with_paths]
    Search for source repositories under source root (-s,--source),
    and add them. Path (relative to the source root) is optional.
    File containing the paths can be specified via @path syntax.
    Option may be repeated.
-s, --source /path/to/source/root
    The root directory of the source tree.
--style path
    Path to the subdirectory in the web application containing the requested
    stylesheet. The factory-setting is: "default".
-T, --threads number
    The number of threads to use for index generation, repository scan
    and repository invalidation.
    By default the number of threads will be set to the number of available
    CPUs. This influences the number of spawned ctags processes as well.
-t, --tabSize number
    Default tab size to use (number of spaces per tab character).
--token string|@file_with_string
    Authorization bearer API token to use when making API calls
    to the web application
-U, --uri SCHEME://webappURI:port/contextPath
    Send the current configuration to the specified web application.
--updateConfig
    Populate the web application with a bare configuration, and exit.
--userPage URL
    Base URL of the user Information provider.
    Example: "https://www.example.org/viewProfile.jspa?username=".
    Use "none" to disable link.
--userPageSuffix URL-suffix
    URL Suffix for the user Information provider. Default: "".
-V, --version
    Print version, and quit.
-v, --verbose
    Set logging level to INFO.
-W, --writeConfig /path/to/configuration
    Write the current configuration to the specified file (so that the web
    application can use the same configuration).
--webappCtags on|off
    Web application should run ctags when necessary. Default is off.
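Putting a few of these options together, an indexing run might look like the sketch below. Every flag used comes from the help output above. The `/opengrok` base directory, the `src` and `data` subdirectories, and the web application URL `http://localhost:8080/source` are assumptions to adapt to your installation. The `DRY_RUN` switch is a convenience of this sketch, not an OpenGrok feature.

```shell
#!/bin/sh
# Assumed layout and URL; adjust to your installation.
OPENGROK_BASE="${OPENGROK_BASE:-/opengrok}"
WEBAPP_URL="${WEBAPP_URL:-http://localhost:8080/source}"

# Run the indexer with history (-H), one project per top-level directory (-P),
# repository discovery (-S) and tag assignment (-G). Extra arguments are
# appended as given. With DRY_RUN=1 the command is printed, not executed.
run_indexer() {
    set -- java -jar "$OPENGROK_BASE/dist/lib/opengrok.jar" \
        -s "$OPENGROK_BASE/src" \
        -d "$OPENGROK_BASE/data" \
        -H -P -S -G \
        -W "$OPENGROK_BASE/etc/configuration.xml" \
        -U "$WEBAPP_URL" \
        "$@"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Calling `run_indexer --progress` would start a full index with per-project progress output. Setting `DRY_RUN=1` first lets you inspect the command line before committing to a run that may take hours.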
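- Ignoring files and directories
A single AOSP sync takes several hours, so it helps to have the indexing script skip certain directories, such as out and toolchain.
The directories I ignore are listed below; to ignore a file rather than a directory, change the d prefix to f.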
d:out
d:prebuilts
d:cts
d:platform_testing
d:autotest
d:*old_codebase*
d:toolchain
d:rockdev
d:pdk
d:.repo
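One way to feed such a list to the indexer is to keep the patterns in a text file and turn each line into an `-i` option. The helper below is a sketch: the function name and the file name `ignore.txt` are hypothetical, and blank lines and lines starting with `#` are skipped as a convenience.

```shell
#!/bin/sh
# Append one "-i pattern" pair per line of an ignore file to a command, then run it.
#   $1 = file with one pattern per line (d:dir or f:file)
#   remaining arguments = the command to run
with_ignores() {
    file=$1
    shift
    while IFS= read -r pattern || [ -n "$pattern" ]; do
        case "$pattern" in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        set -- "$@" -i "$pattern"
    done < "$file"
    "$@"
}
```

For example, `with_ignores ignore.txt java -jar /opengrok/dist/lib/opengrok.jar -s /opengrok/src -d /opengrok/data -H -P` would run the indexer with every pattern from the file. Because each pattern is passed as a single quoted argument, wildcards such as `d:*old_codebase*` reach OpenGrok without being expanded by the shell.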
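- Periodic updates of source and index
The indexing command can be added to crontab as a scheduled job so that updates run automatically.
- Source storage space
A single AOSP project alone takes up several hundred GB.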
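Returning to the periodic update step, a crontab entry could be generated as in the sketch below. The wrapper script `/opengrok/bin/reindex.sh` (which would update the source and then re-run the indexer) and the log path are hypothetical names, and the schedule is only an example.

```shell
#!/bin/sh
# Print a crontab line that runs a command every day at the given time
# and appends its output to a log file.
#   $1 = hour (0-23), $2 = minute (0-59), $3 = command, $4 = log file
cron_line() {
    printf '%s %s * * * %s >> %s 2>&1\n' "$2" "$1" "$3" "$4"
}

# Hypothetical wrapper script, scheduled for 02:30 every night.
cron_line 2 30 /opengrok/bin/reindex.sh /opengrok/log/reindex.log
```

The printed line can be pasted into `crontab -e`, or appended with `(crontab -l 2>/dev/null; cron_line 2 30 /opengrok/bin/reindex.sh /opengrok/log/reindex.log) | crontab -`.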
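You are welcome to follow my WeChat official account "虎哥 LoveDroid", where original technical articles are pushed as soon as they are published.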
