Inheritance diagram for elf_symbolizer.ELFSymbolizer:

Classes
class	Addr2Line

Public Member Functions
	__init__ (self, elf_file_path, addr2line_path, callback, inlines=False, max_concurrent_jobs=None, addr2line_timeout=30, max_queue_size=50, source_root_path=None, strip_base_path=None)

	SymbolizeAsync (self, addr, callback_arg=None)

	Join (self)

Public Attributes
	elf_file_path

	addr2line_path

	callback

	inlines

	max_concurrent_jobs

	max_queue_size

	addr2line_timeout

	requests_counter

	disambiguate

	disambiguation_table

	strip_base_path

	source_root_path

Protected Member Functions
	_CreateNewA2LInstance (self)

	_CreateDisambiguationTable (self)

Protected Attributes
	_a2l_instances

Detailed Description

An uber-fast (multiprocessing, pipelined and asynchronous) ELF symbolizer.

This class is a frontend for addr2line (part of GNU binutils), designed to
symbolize large batches of addresses for a given ELF file. It supports
sharding symbolization across many addr2line instances and pipelining
multiple requests per instance (in order to hide addr2line internals and
OS pipe latencies).

The interface exhibited by this class is a very simple asynchronous interface,
which is based on the following three methods:
- SymbolizeAsync(): used to request (enqueue) resolution of a given address.
- The |callback| method: used to communicate back the symbol information.
- Join(): called to conclude the batch to gather the last outstanding results.
In essence, before the Join method returns, this class will have issued as
many callbacks as the number of SymbolizeAsync() calls. In this regard, note
that due to multiprocess sharding, callbacks can be delivered out of order.
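
The calling pattern described above can be sketched with a stand-in class. FakeSymbolizer below is hypothetical: it only mimics the three-part contract (it resolves nothing and spawns no addr2line), and its placeholder string is not an ELFSymbolInfo. It deliberately delivers callbacks in reverse order to emulate out-of-order delivery, which is why the caller passes an index as |callback_arg| to correlate results with requests.

```python
class FakeSymbolizer(object):
    """Hypothetical stand-in mimicking the SymbolizeAsync/callback/Join contract."""

    def __init__(self, callback):
        self.callback = callback
        self._pending = []

    def SymbolizeAsync(self, addr, callback_arg=None):
        assert isinstance(addr, int)
        self._pending.append((addr, callback_arg))

    def Join(self):
        # Deliver in reverse order to emulate out-of-order callbacks.
        while self._pending:
            addr, callback_arg = self._pending.pop()
            sym_info = 'symbol_at_0x%x' % addr  # Placeholder, not an ELFSymbolInfo.
            self.callback(sym_info, callback_arg)


def symbolize_all(addresses):
    """Returns a list of results aligned with |addresses|."""
    results = [None] * len(addresses)

    def on_symbol(sym_info, callback_arg):
        # callback_arg carries the request's index, so ordering does not matter.
        results[callback_arg] = sym_info

    symbolizer = FakeSymbolizer(on_symbol)
    for index, addr in enumerate(addresses):
        symbolizer.SymbolizeAsync(addr, callback_arg=index)
    symbolizer.Join()  # Every request has received its callback by now.
    return results
```

With the real class, the same loop would apply, with ELFSymbolizer constructed from an ELF path, an addr2line path and the callback.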

Some background about addr2line:
- it is invoked passing the elf path in the cmdline, piping the addresses in
its stdin and getting results on its stdout.
- it has pretty large response times for the first requests, but it
works very well in streaming mode once it has been warmed up.
- it doesn't scale by itself (on more cores). However, spawning multiple
instances at the same time on the same file is pretty efficient as they
keep hitting the pagecache and become mostly CPU bound.
- it might hang or crash, mostly due to OOM. This class deals with both of these
problems.
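
The stdin/stdout streaming protocol from the first bullet can be illustrated with a long-lived child process. This is only a sketch: the child here is a small Python echo script standing in for addr2line (so the example runs without binutils or an ELF file), the helper name resolve_streaming is hypothetical, and requests are sent one at a time rather than pipelined.

```python
import subprocess
import sys

# Stand-in child: echoes each input line back, prefixed with "resolved:".
# A real setup would launch addr2line on the ELF file instead.
_CHILD_CODE = (
    'import sys\n'
    'for line in iter(sys.stdin.readline, ""):\n'
    '    sys.stdout.write("resolved:" + line)\n'
    '    sys.stdout.flush()\n'
)


def resolve_streaming(addresses):
    """Pipes addresses to one long-lived child and reads one reply per request."""
    proc = subprocess.Popen([sys.executable, '-c', _CHILD_CODE],
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            universal_newlines=True)
    replies = []
    try:
        for addr in addresses:
            proc.stdin.write('%x\n' % addr)
            proc.stdin.flush()  # Push the request through the pipe right away.
            replies.append(proc.stdout.readline().rstrip('\n'))
    finally:
        proc.stdin.close()  # EOF tells the child to exit.
        proc.stdout.close()
        proc.wait()
    return replies
```

Keeping the child alive across requests is what lets the warm-up cost be paid once per instance rather than once per address.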

Despite the "scary" imports and the multi* words above, (almost) no multi-
threading/processing is involved from the python viewpoint. Concurrency
here is achieved by spawning several addr2line subprocesses and handling their
output pipes asynchronously. Therefore, all the code here (with the exception
of the Queue instance in Addr2Line) should be free from mind-blowing
thread-safety concerns.

The multiprocess sharding works as follows:
The symbolizer tries to use as few addr2line instances as possible (within
the |max_concurrent_jobs| limit) and initially enqueues all the requests in a
single addr2line instance. For few symbols (i.e. dozens), sharding isn't
worth the startup cost.
The multiprocess logic kicks in as soon as the queues for the existing
instances grow. Specifically, once all the existing instances reach the
|max_queue_size| bound, a new addr2line instance is spawned.
In the case of a very eager producer (i.e. all |max_concurrent_jobs| instances
have a backlog of |max_queue_size|), back-pressure is applied on the caller by
blocking the SymbolizeAsync method.
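
The three outcomes described above (enqueue, spawn, block) can be modelled as a pure function over the current queue sizes. This is a simplified sketch: choose_instance and its return values are hypothetical names, and ties between equally short queues are broken by position only.

```python
def choose_instance(queue_sizes, max_queue_size, max_concurrent_jobs):
    """Returns (index, action) for the next request.

    queue_sizes: outstanding requests per existing instance (at least one).
    action: 'enqueue', 'spawn' (index is that of the new instance) or
        'wait' (caller blocks until the chosen instance drains one result).
    """
    # Prefer the existing instance with the shortest queue.
    index = min(range(len(queue_sizes)), key=lambda i: queue_sizes[i])
    if queue_sizes[index] < max_queue_size:
        return index, 'enqueue'
    # Every queue is full: grow the pool if allowed, else apply back-pressure.
    if len(queue_sizes) < max_concurrent_jobs:
        return len(queue_sizes), 'spawn'
    return index, 'wait'
```

For example, with max_queue_size=50 a single instance holding 3 requests keeps receiving work, while two full instances trigger a spawn only if |max_concurrent_jobs| is greater than 2.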

This module has been deliberately designed to be dependency free (with respect
to other modules in this project), to allow easy reuse in external projects.

Definition at line 25 of file elf_symbolizer.py.

Constructor & Destructor Documentation

◆ __init__()

elf_symbolizer.ELFSymbolizer.__init__	(	self,
		elf_file_path,
		addr2line_path,
		callback,
		inlines = `False`,
		max_concurrent_jobs = `None`,
		addr2line_timeout = `30`,
		max_queue_size = `50`,
		source_root_path = `None`,
		strip_base_path = `None`
	)

Args:
elf_file_path: path of the elf file to be symbolized.
addr2line_path: path of the toolchain's addr2line binary.
callback: a callback which will be invoked for each resolved symbol with
  the two args (sym_info, callback_arg). The former is an instance of
  |ELFSymbolInfo| and contains the symbol information. The latter is an
  embedder-provided argument which is passed to SymbolizeAsync().
inlines: when True, the ELFSymbolInfo will contain also the details about
  the outer inlining functions. When False, only the innermost function
  will be provided.
max_concurrent_jobs: Max number of addr2line instances spawned.
  Parallelize responsibly, addr2line is a memory and I/O monster.
max_queue_size: Max number of outstanding requests per addr2line instance.
addr2line_timeout: Max time (in seconds) to wait for an addr2line response.
  After the timeout, the instance will be considered hung and respawned.
source_root_path: In some toolchains only the name of the source file is
  output, without any path information; disambiguation searches
  through the source directory specified by |source_root_path| argument
  for files whose name matches, adding the full path information to the
  output. For example, if the toolchain outputs "unicode.cc" and there
  is a file called "unicode.cc" located under |source_root_path|/foo,
  the tool will replace "unicode.cc" with
  "|source_root_path|/foo/unicode.cc". If there are multiple files with
  the same name, disambiguation will fail because the tool cannot
  determine which of the files was the source of the symbol.
strip_base_path: Rebases the symbols' source paths onto |source_root_path|
  (i.e. replaces |strip_base_path| with |source_root_path|).
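
The rebasing described for |strip_base_path| amounts to a prefix substitution. The helper below is a hypothetical sketch of that idea using plain string operations; how the module itself performs the substitution is not shown on this page.

```python
def rebase_source_path(source_path, strip_base_path, source_root_path):
    """Replaces a leading |strip_base_path| with |source_root_path|."""
    if strip_base_path and source_path.startswith(strip_base_path):
        # Assumption: a missing source_root_path simply drops the prefix.
        return (source_root_path or '') + source_path[len(strip_base_path):]
    return source_path  # Paths outside the base are left untouched.
```

For instance, a path built under /build/src would be mapped into a local checkout by passing '/build/src' as the base and the checkout directory as the root.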

Definition at line 77 of file elf_symbolizer.py.

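    def __init__(self,
                 elf_file_path,
                 addr2line_path,
                 callback,
                 inlines=False,
                 max_concurrent_jobs=None,
                 addr2line_timeout=30,
                 max_queue_size=50,
                 source_root_path=None,
                 strip_base_path=None):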
        """Args:
      elf_file_path: path of the elf file to be symbolized.
      addr2line_path: path of the toolchain's addr2line binary.
      callback: a callback which will be invoked for each resolved symbol with
          the two args (sym_info, callback_arg). The former is an instance of
          |ELFSymbolInfo| and contains the symbol information. The latter is an
          embedder-provided argument which is passed to SymbolizeAsync().
      inlines: when True, the ELFSymbolInfo will contain also the details about
          the outer inlining functions. When False, only the innermost function
          will be provided.
      max_concurrent_jobs: Max number of addr2line instances spawned.
          Parallelize responsibly, addr2line is a memory and I/O monster.
      max_queue_size: Max number of outstanding requests per addr2line instance.
      addr2line_timeout: Max time (in seconds) to wait for an addr2line response.
          After the timeout, the instance will be considered hung and respawned.
      source_root_path: In some toolchains only the name of the source file is
          output, without any path information; disambiguation searches
          through the source directory specified by |source_root_path| argument
          for files whose name matches, adding the full path information to the
          output. For example, if the toolchain outputs "unicode.cc" and there
          is a file called "unicode.cc" located under |source_root_path|/foo,
          the tool will replace "unicode.cc" with
          "|source_root_path|/foo/unicode.cc". If there are multiple files with
          the same name, disambiguation will fail because the tool cannot
          determine which of the files was the source of the symbol.
      strip_base_path: Rebases the symbols' source paths onto |source_root_path|
          (i.e. replaces |strip_base_path| with |source_root_path|).
    """
        assert (os.path.isfile(addr2line_path)), 'Cannot find ' + addr2line_path
        self.elf_file_path = elf_file_path
        self.addr2line_path = addr2line_path
        self.callback = callback
        self.inlines = inlines
        self.max_concurrent_jobs = (max_concurrent_jobs or
                                    min(multiprocessing.cpu_count(), 4))
        self.max_queue_size = max_queue_size
        self.addr2line_timeout = addr2line_timeout
        self.requests_counter = 0  # For generating monotonic request IDs.
        self._a2l_instances = []  # Up to |max_concurrent_jobs| _Addr2Line inst.
 
        # If necessary, create disambiguation lookup table
        self.disambiguate = source_root_path is not None
        self.disambiguation_table = {}
        self.strip_base_path = strip_base_path
        if (self.disambiguate):
            self.source_root_path = os.path.abspath(source_root_path)
            self._CreateDisambiguationTable()
 
        # Create one addr2line instance. More instances will be created on demand
        # (up to |max_concurrent_jobs|) depending on the rate of the requests.
        self._CreateNewA2LInstance()
 

Member Function Documentation

◆ _CreateDisambiguationTable()

elf_symbolizer.ELFSymbolizer._CreateDisambiguationTable ( self )

protected

 Non-unique file names will result in None entries

Definition at line 193 of file elf_symbolizer.py.

    def _CreateDisambiguationTable(self):
        """ Non-unique file names will result in None entries"""
        start_time = time.time()
        logging.info('Collecting information about available source files...')
        self.disambiguation_table = {}
 
        for root, _, filenames in os.walk(self.source_root_path):
            for f in filenames:
                self.disambiguation_table[f] = os.path.join(
                    root, f) if (f not in self.disambiguation_table) else None
        logging.info(
            'Finished collecting information about '
            'possible files (took %.1f s).', (time.time() - start_time))
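
To see the None convention in action, the following sketch rebuilds the same kind of table as a free function and runs it over a throwaway directory tree. The names build_disambiguation_table and demo are hypothetical, and the file layout is made up for illustration.

```python
import os
import shutil
import tempfile


def build_disambiguation_table(source_root_path):
    """Maps each file name to its full path, or None when the name is not unique."""
    table = {}
    for root, _, filenames in os.walk(source_root_path):
        for name in filenames:
            table[name] = os.path.join(root, name) if name not in table else None
    return table


def demo():
    """Builds a table over a temporary tree with one unique and one duplicated name."""
    root = tempfile.mkdtemp()
    try:
        for rel in ('foo/unicode.cc', 'foo/util.cc', 'bar/util.cc'):
            path = os.path.join(root, *rel.split('/'))
            if not os.path.isdir(os.path.dirname(path)):
                os.makedirs(os.path.dirname(path))
            open(path, 'w').close()
        return root, build_disambiguation_table(root)
    finally:
        shutil.rmtree(root)  # The table only holds strings, so cleanup is safe.
```

A lookup with table.get(name) therefore returns None both for names that are ambiguous and for names that were never seen, so callers cannot tell the two cases apart from the table alone.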
 

◆ _CreateNewA2LInstance()

elf_symbolizer.ELFSymbolizer._CreateNewA2LInstance ( self )

protected

Definition at line 187 of file elf_symbolizer.py.

    def _CreateNewA2LInstance(self):
        assert (len(self._a2l_instances) < self.max_concurrent_jobs)
        a2l = ELFSymbolizer.Addr2Line(self)
        self._a2l_instances.append(a2l)
        return a2l
 

◆ Join()

elf_symbolizer.ELFSymbolizer.Join ( self )

Waits for all the outstanding requests to complete and terminates.

Definition at line 181 of file elf_symbolizer.py.

    def Join(self):
        """Waits for all the outstanding requests to complete and terminates."""
        for a2l in self._a2l_instances:
            a2l.WaitForIdle()
            a2l.Terminate()
 

◆ SymbolizeAsync()

elf_symbolizer.ELFSymbolizer.SymbolizeAsync	(	self,
		addr,
		callback_arg = `None`
	)

Requests symbolization of a given address.

This method is not guaranteed to return immediately. It generally does, but
in some scenarios (e.g. all addr2line instances have full queues) it can
block to create back-pressure.

Args:
addr: address to symbolize.
callback_arg: optional argument which will be passed to the |callback|.

Definition at line 139 of file elf_symbolizer.py.

    def SymbolizeAsync(self, addr, callback_arg=None):
        """Requests symbolization of a given address.
 
    This method is not guaranteed to return immediately. It generally does, but
    in some scenarios (e.g. all addr2line instances have full queues) it can
    block to create back-pressure.
 
    Args:
      addr: address to symbolize.
      callback_arg: optional argument which will be passed to the |callback|."""
        assert (isinstance(addr, int))
 
        # Process all the symbols that have been resolved in the meanwhile.
        # Essentially, this drains all the addr2line(s) out queues.
        for a2l_to_purge in self._a2l_instances:
            a2l_to_purge.ProcessAllResolvedSymbolsInQueue()
            a2l_to_purge.RecycleIfNecessary()
 
        # Find the best instance according to this logic:
        # 1. Find an existing instance with the shortest queue.
        # 2. If all of instances' queues are full, but there is room in the pool,
        #    (i.e. < |max_concurrent_jobs|) create a new instance.
        # 3. If there were already |max_concurrent_jobs| instances and all of them
        #    had full queues, make back-pressure.
 
        # 1.
        def _SortByQueueSizeAndReqID(a2l):
            return (a2l.queue_size, a2l.first_request_id)
 
        a2l = min(self._a2l_instances, key=_SortByQueueSizeAndReqID)
 
        # 2.
        if (a2l.queue_size >= self.max_queue_size and
                len(self._a2l_instances) < self.max_concurrent_jobs):
            a2l = self._CreateNewA2LInstance()
 
        # 3.
        if a2l.queue_size >= self.max_queue_size:
            a2l.WaitForNextSymbolInQueue()
 
        a2l.EnqueueRequest(addr, callback_arg)
 

Member Data Documentation

◆ _a2l_instances

elf_symbolizer.ELFSymbolizer._a2l_instances

protected

Definition at line 125 of file elf_symbolizer.py.

◆ addr2line_path

elf_symbolizer.ELFSymbolizer.addr2line_path

Definition at line 117 of file elf_symbolizer.py.

◆ addr2line_timeout

elf_symbolizer.ELFSymbolizer.addr2line_timeout

Definition at line 123 of file elf_symbolizer.py.

◆ callback

elf_symbolizer.ELFSymbolizer.callback

Definition at line 118 of file elf_symbolizer.py.

◆ disambiguate

elf_symbolizer.ELFSymbolizer.disambiguate

Definition at line 128 of file elf_symbolizer.py.

◆ disambiguation_table

elf_symbolizer.ELFSymbolizer.disambiguation_table

Definition at line 129 of file elf_symbolizer.py.

◆ elf_file_path

elf_symbolizer.ELFSymbolizer.elf_file_path

Definition at line 116 of file elf_symbolizer.py.

◆ inlines

elf_symbolizer.ELFSymbolizer.inlines

Definition at line 119 of file elf_symbolizer.py.

◆ max_concurrent_jobs

elf_symbolizer.ELFSymbolizer.max_concurrent_jobs

Definition at line 120 of file elf_symbolizer.py.

◆ max_queue_size

elf_symbolizer.ELFSymbolizer.max_queue_size

Definition at line 122 of file elf_symbolizer.py.

◆ requests_counter

elf_symbolizer.ELFSymbolizer.requests_counter

Definition at line 124 of file elf_symbolizer.py.

◆ source_root_path

elf_symbolizer.ELFSymbolizer.source_root_path

Definition at line 132 of file elf_symbolizer.py.

◆ strip_base_path

elf_symbolizer.ELFSymbolizer.strip_base_path

Definition at line 130 of file elf_symbolizer.py.

The documentation for this class was generated from the following file:

third_party/dart-lang/sdk/third_party/binary_size/src/elf_symbolizer.py
