close
The Wayback Machine - https://web.archive.org/web/20260106130013/https://github.com/github/stack-graphs/pull/9
Skip to content
This repository was archived by the owner on Sep 9, 2025. It is now read-only.

Conversation

@dcreager
Copy link
Contributor

@dcreager dcreager commented Jun 3, 2021

This is an implementation of the path stitching algorithm that's divided up into "phases". At the start of each phase, we process whatever (possibly incomplete) paths are currently in the queue. As we extend those paths with partial paths, we queue up the newly extended paths to be processed in the next phase. This phasing approach means that the Database instance doesn't need to be pre-loaded with all of the partial paths that we might ever need to process. Instead, you're notified of each path in the phase before it will be processed, giving you a chance to add its extensions to the database before allowing the next phase to proceed.

Copy link

@rewinfrey rewinfrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks 👍 and really appreciate all the 📝 ! I focused mostly on the append_partial_path code path to better internalize the handling of symbol and scope stacks and the symbol and scope bindings. I left a couple questions and one small request for a bit of additional 📝 . Great work 😄

) -> Handle<PartialPath> {
let start_node = path.start_node;
let symbol_stack_precondition = path.symbol_stack_precondition;
let handle = self.partial_paths.add(path);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there the possibility of loading partial paths concurrently? Or is that something we'd should explicitly not allow based on the add operation? It doesn't look like that would be thread safe.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's purposefully not safe to load partial paths concurrently, you have to have mut access to the PartialPaths arena. Over in the c-api-* branches I'm working on C wrappers that would let you pass in batches of objects to add to the stack graph. So the Go code will be able to load in a bunch of partial paths in worker goroutines, and then hand them off to a single C call to add them to the Rust code's internal state. For instance:

stack-graphs/src/c.rs

Lines 452 to 467 in dd3bd54

/// Adds new scope stacks to the path arena. `count` is the number of scope stacks you want to
/// create. The content of each scope stack comes from two arrays. The `lengths` array must have
/// `count` elements, and provides the number of scopes in each scope stack. The `scopes` array
/// contains the contents of each of these scope stacks in one contiguous array. Its length must
/// be the sum of all of the counts in the `lengths` array.
///
/// You must also provide an `out` array, which must also have room for `count` elements. We will
/// fill this array in with the `sg_scope_stack` instances for each scope stack that is created.
#[no_mangle]
pub extern "C" fn sg_path_arena_add_scope_stacks(
paths: *mut sg_path_arena,
count: usize,
mut scopes: *const sg_node_handle,
lengths: *const usize,
out: *mut sg_scope_stack,
) {

self.symbols.push_front(&mut db.symbol_stack_keys, symbol);
let handle = self.back_handle();
db.symbol_stack_key_cache.insert(cache_key, handle);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm used to thinking about stacks in terms of "top" and "bottom" of the stack, and have been translating "front" -> "top" and "back" -> "bottom". Is that the right idea?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I've been using front and back to describe how they're typically written in our CLI output, where the top of the stack is on the left (i.e. the front of the list of symbols). e.g. in cheese.ints.one, the stack consists of 5 symbols: cheese, ., ints, ., and one. cheese is the top or front of stack (and would be the thing popped off by any pop node we run across during pathfinding). one is bottom or back of stack.


/// Pops a symbol from the back of this symbol stack key.
fn pop_back(&mut self, db: &Database) -> Option<Handle<Symbol>> {
self.symbols.pop_front(&db.symbol_stack_keys).copied()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be self.symbols.pop_back instead of pop_front?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, we're using List under the covers, which only has push/pop_front. That means we're storing the content of the key in reverse order. That ends up not mattering since it's an internal detail; as long as we're consistent throughout the implementation of this type it will all work out. I'll add a clarifying comment to that effect.

dcreager added 3 commits June 28, 2021 11:11
This database maintains internal in-memory indexes to support all of the
lookups that we might need to perform during the path-stitching
algorithm.  Most users will probably have some external storage layer
holding partial paths, in which case you will be responsible for loading
into the `Database` instance all of the partial paths that are valid
extensions of some current path.
Partial paths can have variables in their preconditions and
postconditions.  When trying to append a partial path to a path, you
"match" the partial path's precondition against the path's corresponding
stack.  Any variables in the precondition can "bind" parts of the path's
stack.  You then "apply" those bindings to the partial path's
postcondition.  Any variable references in the postcondition are
substituted with the stack contents that were bound from the
precondition.  The result is the path's new stack after having performed
the concatenation.
This is an implementation of the path stitching algorithm that's divided
up into "phases".  At the start of each phase, we process whatever
(possibly incomplete) paths are currently in the queue.  As we extend
those paths with partial paths, we queue up the newly extended paths to
be processed in the _next_ phase.  This phasing approach means that the
Database instance doesn't need to be pre-loaded with _all_ of the
partial paths that we might ever need to process.  Instead, you're
notified of each path in the phase _before_ it will be processed, giving
you a chance to add its extensions to the database before allowing the
next phase to proceed.
@dcreager dcreager merged commit 373dc95 into main Jun 28, 2021
@dcreager dcreager deleted the path-stitching branch June 28, 2021 15:13
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants