Serene Runtime 1.0.0
C runtime for the Serene programming language
Loading...
Searching...
No Matches
seq.h File Reference

This is an implementation of bit - partitioned, persistent, immutable sequence For more information, have a look at Ideal hash trees paper. More...

#include <stddef.h>
#include <stdint.h>
#include "serene/rt/errors.h"
Include dependency graph for seq.h:
This graph shows which files directly or indirectly include this file:

Go to the source code of this file.

Data Structures

struct  seq_lookup_result_t
 
struct  seq_node_t
 We have two type of node that both are implemented using the same data structure. More...
 
struct  seq_t
 

Macros

#define SEQ_BR   32u
 branching factor (power of two)
 
#define SEQ_SHIFT   5u
 log2(SEQ_BR)
 
#define SEQ_MASK   (SEQ_BR - 1u)
 
#define SEQ_MAX_DEPTH   11
 logical length.
 

Typedefs

typedef void * seq_elem_t
 We use generic pointers to refer to internal nodes, leaf nodes and even elements.
 
typedef struct seq_lookup_result_t seq_lookup_result_t
 
typedef struct seq_node_t seq_node_t
 We have two type of node that both are implemented using the same data structure.
 
typedef struct seq_t seq_t
 

Functions

seq_t seq_empty (const srn_context_t *ctx)
 
seq_t seq_push (const srn_context_t *ctx, const seq_t *seq, seq_elem_t x)
 
seq_lookup_result_t seq_get (const srn_context_t *ctx, const seq_t *seq, size_t n)
 Negative index is not supported.
 

Detailed Description

This is an implementation of bit - partitioned, persistent, immutable sequence For more information, have a look at Ideal hash trees paper.

TL;DR:

  • Every struct is immutable (from user perspective).
  • Nodes are persistent and immutable.
  • Nodes are heap allocated
  • Heads can be stack allocated
  • We create a new head for any change
  • Heads contain a tail buffer that will be converted to a leaf node when they are full
  • This data structure is NOT effecient for extracting sub sequences
  • This data structure is effecient for tail operations,
  • If you need a subset of the elments in the sequence, create a list view out of them instead of extracting them as a new seq.
  • Deleting operation is O(N)

TODO:

  • Create map and filter function or a generic fold
  • Create a helper function to help creating list view from the seq

Definition in file seq.h.

Macro Definition Documentation

◆ SEQ_BR

#define SEQ_BR   32u

branching factor (power of two)

Definition at line 112 of file seq.h.

◆ SEQ_MASK

#define SEQ_MASK   (SEQ_BR - 1u)

Definition at line 116 of file seq.h.

◆ SEQ_MAX_DEPTH

#define SEQ_MAX_DEPTH   11

logical length.

While techically this implementation will support up to 2^85 elements in each seq, but we will limit it down to (2^64-1)(UINT64_MAX)(on 64bit machines) in order to keep the seq_t as small as possible. 12 is the magic number that we use to limit the depth to. 32^11 = 1,152,921,504,606,846,976 elements

Definition at line 123 of file seq.h.

◆ SEQ_SHIFT

#define SEQ_SHIFT   5u

log2(SEQ_BR)

Definition at line 115 of file seq.h.

Typedef Documentation

◆ seq_elem_t

typedef void* seq_elem_t

We use generic pointers to refer to internal nodes, leaf nodes and even elements.

It makes the calculation cruical to determining what type of data we are looking at, in each node. seq_elem_t will be pointing to actual user data when the node is a leaf node (depth == 0) and it will be pointing to the next node in the trie if the node is an inner node (depth != 0).

Definition at line 131 of file seq.h.

◆ seq_lookup_result_t

typedef struct seq_lookup_result_t seq_lookup_result_t

◆ seq_node_t

typedef struct seq_node_t seq_node_t

We have two type of node that both are implemented using the same data structure.

Inner nodes that point to other inner nodes or leaf nodes, and leaf nodes which points to actual elements of the sequence.

The main factor in determining the nature of the node is the depth of the trie. Depth zero, means a leaf node and an inner node otherwise.

◆ seq_t

typedef struct seq_t seq_t

Function Documentation

◆ seq_empty()

seq_t seq_empty ( const srn_context_t * ctx)
nodiscard

Definition at line 76 of file seq.c.

76 {
77 seq_t seq;
78 seq.len = 0;
79 seq.tail_len = 0;
80 seq.tail = seq_create_page(ctx);
81 seq.maybe_error = nullptr;
82 seq.root = nullptr;
83 seq.depth = 0;
84 return seq;
85}
static seq_elem_t * seq_create_page(const srn_context_t *ctx)
Create a PAGE (just slots for the data)
Definition seq.c:40
Definition seq.h:155
size_t len
logical length.
Definition seq.h:161
seq_node_t * root
NULL means “all data is in tail”
Definition seq.h:167
seq_elem_t * tail
small tail array for fast push/pop.
Definition seq.h:170
uint8_t depth
tree depth in levels (0 == leaf level)
Definition seq.h:165
uint16_t tail_len
0..SEQ_BR
Definition seq.h:163
Here is the call graph for this function:
Here is the caller graph for this function:

◆ seq_get()

seq_lookup_result_t seq_get ( const srn_context_t * ctx,
const seq_t * seq,
size_t n )
nodiscard

Negative index is not supported.

Definition at line 173 of file seq.c.

173 {
174 seq_lookup_result_t result = {.data = nullptr, .maybe_error = nullptr};
175
176 if (n >= seq->len) {
177 char *err_msg = (char *)ALLOC(ctx, size_t);
178 // +1 is due to null terminated strings
179 SNPRINTF(err_msg, sizeof(size_t) + 1, "%zu", n);
180
181 result.maybe_error = ERR(ctx, INDEX_OUT_OF_BOUND, err_msg);
182 result.data = nullptr;
183 return result;
184 }
185
186 size_t tail_start = seq->len - (size_t)seq->tail_len;
187 if (n >= tail_start) {
188 // Tail lookup
189 size_t off = n - tail_start; // 0 .. tail_len-1
190 result.data = seq->tail[off];
191 return result;
192 }
193
194 SEQ_LOG("Looking up in trie: TL: %d, L: %zu N: %zu", seq->tail_len, seq->len, n);
195 seq_node_t *node = seq->root;
196 for (int16_t d = seq->depth; d >= 0; d--) {
197 uint16_t index = seq_depth_index(d, n);
198 SEQ_LOG("Looking up in trie: D: %d, I: %hu", d, index);
199
200 if (d == 0) {
201 SEQ_LOG("We're at depth zero");
202 result.data = node->children[index];
203 return result;
204 }
205
206 node = seq_next_child(ctx, node, index);
207 if (!node) {
208 result.maybe_error = ERR(ctx, CORRUPTED_SEQ, "missing child");
209 result.data = nullptr;
210 return result;
211 }
212 }
213
214 PANIC("It should never happen");
215 // just a dummy return
216 result.data = nullptr;
217 result.maybe_error = ERR(ctx, ABSURD, "");
218 return result;
219}
int n
Definition acutest.h:538
#define ALLOC(ctx, T)
Definition context.h:82
@ CORRUPTED_SEQ
When the memory layout of a seq is incorrect.
Definition errors.h:48
@ ABSURD
Definition errors.h:45
@ INDEX_OUT_OF_BOUND
Definition errors.h:47
#define ERR(ctx, err, msg)
Definition errors.h:74
static seq_node_t * seq_next_child(const srn_context_t *ctx, seq_node_t *n, uint16_t index)
Get or create the next node of the tree.
Definition seq.c:69
static uint16_t seq_depth_index(const uint8_t depth, const uint64_t index)
Find the array index of the given index, considering the given depth.
Definition seq.c:35
#define SEQ_LOG(FMT,...)
Definition seq.c:30
seq_elem_t data
Definition seq.h:135
We have two type of node that both are implemented using the same data structure.
Definition seq.h:147
seq_elem_t * children
We allocate children to be a buffer of SEQ_BR number of pointers.
Definition seq.h:149
#define SNPRINTF(buf, s, f,...)
Definition utils.h:66
#define PANIC(msg)
Definition utils.h:51
Here is the call graph for this function:
Here is the caller graph for this function:

◆ seq_push()

seq_t seq_push ( const srn_context_t * ctx,
const seq_t * seq,
seq_elem_t x )
nodiscard

Definition at line 87 of file seq.c.

87 {
88 if (seq->len >= SIZE_MAX) {
89 seq_t failed_seq;
90 failed_seq.maybe_error =
91 ERR(ctx, SEQ_LIMIT_REACHED, "Can't fit more elements in this sequence");
92 failed_seq.tail_len = 0;
93 failed_seq.len = 0;
94 return failed_seq;
95 }
96
97 if (seq->tail_len < SEQ_BR) {
98 // Tail buffer has enough space
99 seq_t new_seq = *seq;
100 new_seq.maybe_error = nullptr;
101 // new_seq.tail_len++;
102 new_seq.len++;
103 new_seq.tail[new_seq.tail_len++] = x;
104 SEQ_LOG("Tail has space, TL: %d, L: %zu, D: %d", new_seq.tail_len, new_seq.len, new_seq.depth);
105 return new_seq;
106 }
107
108 // elements already in the tree (multiple of SEQ_BR)
109 size_t sz_without_tail = seq->len - (size_t)seq->tail_len;
110 // Build a leaf will own the current tail buffer.
111 seq_node_t *leaf = ALLOC(ctx, seq_node_t);
112 leaf->children = seq->tail;
113
114 // Tail buffer is full and we need to to move it (no copy) inside the trie
115 // and allocate a new tail
116 if (seq->root == nullptr) {
117 // This happens only once, at index
118 // (SEQ_BR + 1) and technically we can just
119 seq_t new_seq = *seq;
120 new_seq.maybe_error = nullptr;
121 new_seq.tail_len = 1;
122 new_seq.len++;
123 new_seq.tail = seq_create_page(ctx);
124 new_seq.tail[new_seq.tail_len - 1] = x;
125 new_seq.root = leaf;
126 new_seq.depth = 0;
127 SEQ_LOG("Root is null, TL: %d, L: %zu, D: %d", new_seq.tail_len, new_seq.len, new_seq.depth);
128
129 return new_seq;
130 }
131 // Root is not Null, so we have inner children and we need to find the slot
132 // which we need to move the tail into
133
134 // If the tree is full at current depth, grow a new root
135 if (sz_without_tail >> ((seq->depth + 1) * SEQ_SHIFT)) {
136 // ^^ == floor(sz_without_tail / 2^(5*(d+1)))
137
138 seq_node_t *new_root = seq_new_node(ctx); // internal node
139 new_root->children[0] = seq->root;
140 new_root->children[1] = seq_new_path(ctx, seq->depth, leaf);
141 seq_t new_seq = *seq;
142 new_seq.root = new_root;
143 new_seq.depth = seq->depth + 1;
144 new_seq.tail = seq_create_page(ctx);
145 new_seq.tail_len = 1;
146 new_seq.len = seq->len + 1;
147 new_seq.tail[0] = x;
148 return new_seq;
149 }
150 // Otherwise insert the leaf at the right fringe under the parent (d==1)
151 seq_node_t *parent = seq->root;
152
153 for (uint8_t d = seq->depth; d > 1; --d) {
154 uint16_t idx = seq_depth_index(d, (uint64_t)sz_without_tail);
155 // ensures internal nodes exist
156 parent = seq_next_child(ctx, parent, idx);
157 }
158
159 // Now at d == 1: use the correct child slot to insert the leaf
160 uint16_t leaf_parent_idx = seq_depth_index(1, (uint64_t)sz_without_tail);
161 parent->children[leaf_parent_idx] = (seq_elem_t)leaf;
162
163 // Start a fresh tail with the pushed element
164 seq_t new_seq = *seq;
165 new_seq.tail = seq_create_page(ctx);
166 new_seq.tail_len = 1;
167 new_seq.len = seq->len + 1;
168 new_seq.tail[0] = x;
169 return new_seq;
170}
@ SEQ_LIMIT_REACHED
Definition errors.h:46
static seq_node_t * seq_new_node(const srn_context_t *ctx)
Definition seq.c:49
static seq_node_t * seq_new_path(const srn_context_t *ctx, uint8_t depth, seq_node_t *leaf)
Walk from the root to a possible leaf node and create all the inner and leaf nodes if necessary.
Definition seq.c:58
#define SEQ_BR
branching factor (power of two)
Definition seq.h:112
#define SEQ_SHIFT
log2(SEQ_BR)
Definition seq.h:115
void * seq_elem_t
We use generic pointers to refer to internal nodes, leaf nodes and even elements.
Definition seq.h:131
Here is the call graph for this function:
Here is the caller graph for this function: