Halide: Bilateral Grid Generator class using Enhanced Generator
I am trying to re-implement the bilateral grid example using the enhanced Generator class (i.e. using schedule() and generate()). I get an error when trying to compile the code.
    g++ -std=c++11 -I ../../include/ -I ../../tools/ -I ../../apps/support/ -g -fno-rtti bilateral_grid_generator.cpp ../../lib/libHalide.a ../../tools/GenGen.cpp -o bin/bilateral_grid_exec -ldl -lpthread -lz
    bin/bilateral_grid_exec -o ./bin target=host
    Generator bilateral_grid has base_path ./bin/bilateral_grid
    Internal error at /home/xxx/projects/halide/src/Generator.cpp:966 triggered by user code at /usr/include/c++/4.8/functional:2057:
    Condition failed: generator
    make: *** [bin/bilateral_grid.a] Aborted (core dumped)
It seems that I didn't put the definitions of the RDom and the GeneratorParam in the correct place. Since r.x and r.y are used in both schedule() and generate(), I think the RDom should be a class member. What should be done to fix this?
Here is the code I wrote.
    class BilateralGrid : public Halide::Generator<BilateralGrid> {
    public:
        GeneratorParam<int> s_sigma{"s_sigma", 8};

        //ImageParam input{Float(32), 2, "input"};
        //Param<float> r_sigma{"r_sigma"};

        Input<Buffer<float>> input{"input", 2};
        Input<float> r_sigma{"r_sigma"};
        Output<Buffer<float>> output{"output", 2};

        // Algorithm description
        void generate() {
            //int s_sigma = 8;

            // Add a boundary condition
            clamped(x,y) = BoundaryConditions::repeat_edge(input)(x,y);

            // Construct the bilateral grid
            Expr val = clamped(x * s_sigma + r.x - s_sigma/2, y * s_sigma + r.y - s_sigma/2);
            val = clamp(val, 0.0f, 1.0f);
            Expr zi = cast<int>(val * (1.0f/r_sigma) + 0.5f);

            // Histogram
            histogram(x, y, z, c) = 0.0f;
            histogram(x, y, zi, c) += select(c == 0, val, 1.0f);

            // Blur the grid using a five-tap filter
            blurz(x, y, z, c) = (histogram(x, y, z-2, c) +
                                 histogram(x, y, z-1, c)*4 +
                                 histogram(x, y, z  , c)*6 +
                                 histogram(x, y, z+1, c)*4 +
                                 histogram(x, y, z+2, c));
            blurx(x, y, z, c) = (blurz(x-2, y, z, c) +
                                 blurz(x-1, y, z, c)*4 +
                                 blurz(x  , y, z, c)*6 +
                                 blurz(x+1, y, z, c)*4 +
                                 blurz(x+2, y, z, c));
            blury(x, y, z, c) = (blurx(x, y-2, z, c) +
                                 blurx(x, y-1, z, c)*4 +
                                 blurx(x, y  , z, c)*6 +
                                 blurx(x, y+1, z, c)*4 +
                                 blurx(x, y+2, z, c));

            // Take trilinear samples to compute the output
            val = clamp(input(x, y), 0.0f, 1.0f);
            Expr zv = val * (1.0f/r_sigma);
            zi = cast<int>(zv);
            Expr zf = zv - zi;
            Expr xf = cast<float>(x % s_sigma) / s_sigma;
            Expr yf = cast<float>(y % s_sigma) / s_sigma;
            Expr xi = x/s_sigma;
            Expr yi = y/s_sigma;
            interpolated(x, y, c) =
                lerp(lerp(lerp(blury(xi, yi, zi, c), blury(xi+1, yi, zi, c), xf),
                          lerp(blury(xi, yi+1, zi, c), blury(xi+1, yi+1, zi, c), xf), yf),
                     lerp(lerp(blury(xi, yi, zi+1, c), blury(xi+1, yi, zi+1, c), xf),
                          lerp(blury(xi, yi+1, zi+1, c), blury(xi+1, yi+1, zi+1, c), xf), yf), zf);

            // Normalize and return the output.
            bilateral_grid(x, y) = interpolated(x, y, 0)/interpolated(x, y, 1);
            output(x,y) = bilateral_grid(x,y);
        }

        // Scheduling
        void schedule() {
            // int s_sigma = 8;
            if (get_target().has_gpu_feature()) {
                // GPU schedule
                Var xi{"xi"}, yi{"yi"}, zi{"zi"};

                // Schedule blurz in 8x8 tiles. This is a tile in
                // grid-space, which means it represents something like
                // 64x64 pixels in the input (if s_sigma is 8).
                blurz.compute_root().reorder(c, z, x, y).gpu_tile(x, y, xi, yi, 8, 8);

                // Schedule histogram to happen per-tile of blurz, with
                // intermediate results in shared memory. This means histogram
                // and blurz make a three-stage kernel:
                // 1) Zero out the 8x8 set of histograms
                // 2) Compute those histograms by iterating over lots of the input image
                // 3) Blur the set of histograms in z
                histogram.reorder(c, z, x, y).compute_at(blurz, x).gpu_threads(x, y);
                histogram.update().reorder(c, r.x, r.y, x, y).gpu_threads(x, y).unroll(c);

                // An alternative schedule for histogram that doesn't use shared memory:
                // histogram.compute_root().reorder(c, z, x, y).gpu_tile(x, y, xi, yi, 8, 8);
                // histogram.update().reorder(c, r.x, r.y, x, y).gpu_tile(x, y, xi, yi, 8, 8).unroll(c);

                // Schedule the remaining blurs and the sampling at the end similarly.
                blurx.compute_root().gpu_tile(x, y, z, xi, yi, zi, 8, 8, 1);
                blury.compute_root().gpu_tile(x, y, z, xi, yi, zi, 8, 8, 1);
                bilateral_grid.compute_root().gpu_tile(x, y, xi, yi, s_sigma, s_sigma);
            } else {
                // CPU schedule.
                blurz.compute_root().reorder(c, z, x, y).parallel(y).vectorize(x, 8).unroll(c);
                histogram.compute_at(blurz, y);
                histogram.update().reorder(c, r.x, r.y, x, y).unroll(c);
                blurx.compute_root().reorder(c, x, y, z).parallel(z).vectorize(x, 8).unroll(c);
                blury.compute_root().reorder(c, x, y, z).parallel(z).vectorize(x, 8).unroll(c);
                bilateral_grid.compute_root().parallel(y).vectorize(x, 8);
            }
        }

        Func clamped{"clamped"}, histogram{"histogram"};
        Func bilateral_grid{"bilateral_grid"};
        Func blurx{"blurx"}, blury{"blury"}, blurz{"blurz"}, interpolated{"interpolated"};
        Var x{"x"}, y{"y"}, z{"z"}, c{"c"};
        RDom r{0, s_sigma, 0, s_sigma};
    };

    //Halide::RegisterGenerator<BilateralGrid> register_me{"bilateral_grid"};
    HALIDE_REGISTER_GENERATOR(BilateralGrid, "bilateral_grid");

    }  // namespace
The error here is subtle, and the current assertion failure message is regrettably unhelpful.
The problem is that this code uses a GeneratorParam (s_sigma) to initialize a member-variable RDom (r), but the GeneratorParam may not have its final value set at that point. Generally speaking, accessing a GeneratorParam (or ScheduleParam) before the generate() method is called will produce such an assert.
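To make the idea of such a check concrete, here is a minimal plain-C++ sketch. It is not Halide's implementation: GuardedParam, set(), finalize() and get() are hypothetical names invented for this illustration. It only shows how a parameter could refuse to hand out its value until all overrides have been applied, and report which parameter was read too early.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a generator parameter; NOT Halide's GeneratorParam.
template <typename T>
class GuardedParam {
public:
    GuardedParam(std::string name, T default_value)
        : name_(std::move(name)), value_(default_value) {}

    // Overrides are accepted only until the value has been finalized.
    void set(T v) {
        if (finalized_) {
            throw std::logic_error("param '" + name_ + "' set after finalize");
        }
        value_ = v;
    }

    // Called by the (hypothetical) driver once all overrides are applied.
    void finalize() { finalized_ = true; }

    // Reading before finalize() is treated as a programming error,
    // and the message names the offending parameter.
    T get() const {
        if (!finalized_) {
            throw std::logic_error("param '" + name_ +
                                   "' read before its final value was set");
        }
        return value_;
    }

private:
    std::string name_;
    T value_;
    bool finalized_ = false;
};
```

With this sketch, a member initializer that calls get() during construction would throw, while the same call made after finalize() returns the overridden value.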
Why is this? Let's look at the way Generators are created and initialized in a typical build system:
- GenGen.cpp creates an instance of your Generator's C++ class; naturally, this executes its C++ constructor, as well as the C++ constructors for all its member variables, in order of declaration.
- GenGen.cpp uses the arguments provided on the command line to override the default values of the GeneratorParams. For example, if you had invoked the Generator with bin/bilateral_grid_exec -o ./bin target=host s_sigma=7, then the default value (8) stored in s_sigma would be replaced by 7.
- GenGen.cpp calls generate(), then schedule(), then compiles the result to a .o (or .a, etc).
So why are you seeing the assert? In this code, the ctor for r runs in step 1 above, but the arguments to that ctor read the current value of s_sigma, which at that point still has its default value (8), not necessarily the value specified by the build file. If we allowed this read to happen without asserting, you could end up with inconsistent values of s_sigma in different parts of your Generator.
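The inconsistency can be reproduced without Halide. In the sketch below, Range, MockGenerator and run_lifecycle are hypothetical stand-ins (for RDom, the Generator class and GenGen's driver logic respectively); they only mimic the construct / override / generate ordering described above, with no guard in place.

```cpp
#include <cassert>

// Hypothetical stand-ins; none of these types are part of Halide.
struct Range {
    int min;
    int extent;
};

struct MockGenerator {
    int s_sigma = 8;      // default value, analogous to the GeneratorParam default
    Range r{0, s_sigma};  // initialized during construction: copies the default (8)

    int extent_seen_in_generate = 0;

    // Runs after any override has been applied, so it sees the final value.
    void generate() { extent_seen_in_generate = s_sigma; }
};

// Mimics the three-step lifecycle with an override of s_sigma.
inline MockGenerator run_lifecycle(int s_sigma_override) {
    MockGenerator g;               // step 1: construct, members in declaration order
    g.s_sigma = s_sigma_override;  // step 2: apply the command-line override
    g.generate();                  // step 3: generate
    return g;
}
```

Calling run_lifecycle(7) leaves r.extent at 8 while generate() sees 7, which is exactly the kind of mismatch the assertion is there to prevent.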
You can fix this by deferring the initialization of the RDom to the generate() method:
    class BilateralGrid : public Halide::Generator<BilateralGrid> {
    public:
        GeneratorParam<int> s_sigma{"s_sigma", 8};
        ...
        void generate() {
            r = RDom(0, s_sigma, 0, s_sigma);
            ...
        }
        ...
    private:
        RDom r;
    };
(Obviously, the assertion failure needs a more helpful error message; I'll modify the code to do so.)