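How to find the source code of Linux commands

In the previous post, I wanted to understand how xargs avoids the "argument list too long" problem, so I decided to read the xargs source code. I had no experience with this sort of thing, and googling a few keywords turned up nothing.

Then it hit me: Stack Overflow, the beginner's best friend. Sure enough, a search there for "linux download source code" led straight to the answer: look in the man page to find out which project the command belongs to. For example, man xargs ends with this passage: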
The best way to report a bug is to use the form at http://savannah.gnu.org/bugs/?group=findutils. The reason  for this is that you will then be able to track progress in fixing the problem. Other comments about xargs(1) and about the findutils package in general can be sent to the bug-findutils mailing list. To join the list, send email to bug-findutils-request@gnu.org.
So googling findutils leads to its official site, which states that the project includes xargs and provides download links.
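
The man-page lookup can also be scripted. What follows is only a rough sketch: the function name gnu_project_from_manpage is made up for illustration, and it assumes the man page contains a savannah.gnu.org bug-tracker URL like the one quoted above, which is not true of every command.

```shell
# Hypothetical helper: read man page text on stdin and print the
# Savannah project name found in a ".../bugs/?group=NAME" URL.
gnu_project_from_manpage() {
    sed -n 's#.*savannah\.gnu\.org/bugs/?group=\([A-Za-z0-9_-]*\).*#\1#p' | head -n 1
}
```

Something like `man xargs | col -b | gnu_project_from_manpage` should then print findutils, provided the installed man page still carries that URL.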

The Ubuntu way

Use apt-get source PACKAGE to download the source code. The preparation steps are:
  1. Make sure /etc/apt/sources.list contains deb-src entries, for example by adding these two lines:

    deb-src http://us.archive.ubuntu.com/ubuntu jaunty main restricted
    deb-src http://us.archive.ubuntu.com/ubuntu jaunty universe multiverse
  2. sudo apt-get update
To look up the package name, use aptitude show PACKAGE / COMMAND; for example, "aptitude show host" reports that host is packaged in dnsutils. Alternatively, use apt-file search FILE to find the package; for example, "apt-file search /usr/bin/xargs" lists findutils.
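
Before running apt-get source, it can help to check that a deb-src line is really enabled. A minimal sketch, where has_deb_src is a hypothetical helper name and the file to inspect is passed in by the caller:

```shell
# Hypothetical helper: succeed if FILE has at least one
# uncommented deb-src line.
has_deb_src() {
    grep -Eq '^[[:space:]]*deb-src[[:space:]]' "$1"
}
```

For example, `has_deb_src /etc/apt/sources.list && apt-get source findutils` fetches the findutils source into the current directory only when a source repository is configured in that file.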

Comments

  1. The easier and more common answer to "how to get the source of Linux utilities" is to learn your Linux distribution's package manager, e.g. under Fedora:

    $ rpmquery -f /usr/bin/xargs
    findutils-4.4.2-6.fc12.x86_64
    $ yumdownloader --source findutils
    < ... snip ... >
    findutils-4.4.2-6.fc12.src.rpm
    $ rpmdev-setuptree
    $ rpm -i findutils-4.4.2-6.fc12.src.rpm
    $ yum-builddep findutils-4.4.2-6.fc12.src.rpm # installs packages required to build findutils
    $ cd $HOME/rpmbuild
    $ rpmbuild -bp SPECS/findutils.spec
    $ vi BUILD/findutils-4.4.2/xargs/xargs.c

    See: http://itrs.tw/wiki/RPM_DPKG_Rosetta_Stone for the equivalent commands on Debian.

  2. Thanks.

    By the way, I haven't installed Linux on my home desktop PC, so I cannot use the common way.

    Sometimes I think I should buy a new PC and install Linux on it.

  3. This raises another question: I claim that host(1) does not call gethostbyname(3)/gethostbyaddr(3)/getnameinfo(3)/getaddrinfo(3) but getent(1) does. Can you easily verify or refute that claim?

  4. I thought of four methods:

    1. Based on past experience, I think I can trust your claim, so I don't need to verify it. Just kidding. :)

    2. Add hooks to the suspected functions and recompile the related code. Then run it and see whether the hooks are called.

    3. Use a profiler and look at the function call logs, though I have no experience with C profilers. I guess this would be the easiest if I were familiar with one.

    4. Trace the related code in host.c, watching out for macros and included headers. I think this way is somewhat unreliable and hard.

    Before I start the stupid trials tonight, would you provide some comments in advance? :)

  5. Methods 2 and 4 require recompiling system software and thus are not particularly easy.

    Method 3 is problematic because profilers by default only report the functions that use the most CPU, and "gethostbyaddr" is definitely not a top CPU user. (The most popular and useful Linux system profilers are oprofile and sysprof. The new 'perf' profiler will likely replace them in a few years.)

    Solutions:
    1. strace(1): lists all system calls used
    2. ltrace(1): lists all shared library calls
    3. gdb(1): set a breakpoint on the system function you want to monitor.

  6. Got it. Thanks. So that I don't forget this, I'll write another blog post. XD

