r - Removing elements from nested list -
i have problem i've been working on days , can't find right answer.
i have list needs put mongo database. looks this:
listtest = list( list(section_id = null, name = "name1", slug = "slug1"), list(section_id = null, name = 'name2', slug = 'slug2'), list(section_id = null, name = 'name3', slug = 'slug3', categories = list( list(section_id = null, name = 'name31', slug = 'slug31'), list(section_id = null, name = 'name32', slug = 'slug32', categories = list( list(section_id = null, name = 'name321', slug = 'slug321'), list(section_id = null, name = 'name322', slug = 'slug322'), list(section_id = null, name = 'name323', slug = 'slug323') )), list(section_id = null, name = 'name33', slug = 'slug33', categories = list( list(section_id = null, name = 'name331', slug = 'slug331'), list(section_id = null, name = 'name332', slug = 'slug332'), list(section_id = null, name = 'name333', slug = 'slug333'), list(section_id = null, name = 'name334', slug = 'slug334'), list(section_id = null, name = 'name335', slug = 'slug335') )), list(section_id = null, name = 'name34', slug = 'slug34'), list(section_id = null, name = 'name35', slug = 'slug35', categories = list( list(section_id = null, name = 'name351', slug = 'slug351', categories = list( list(section_id = null, name = 'name3511', slug = 'slug3511'), list(section_id = null, name = 'name3512', slug = 'slug3512'), list(section_id = null, name = 'name3513', slug = 'slug3513') ) ) ) ) ) ) )
the problem have data frame section_ids want put inside nested list based on name or slug. managed this, still leaves me section_ids equal character(0) when there's no secion_id in data frame. how can delete lists section_id equal character(0)? can change section_id read object_id in mongodb? or possible done in json, not in list?
cheers.
i had recreate key table reference.
unique(unlist(listtest, use.names = false)) %>% { data.frame(name_var = .[c(t,f)], slug_var = .[c(f,t)]) %>% mutate(section_id = sample(4678:92456,length(name_var))) %>% select(section_id, name_var, slug_var) } -> key_table
for reference:
> key_table section_id name_var slug_var 1 78002 name1 slug1 2 48508 name2 slug2 3 16510 name3 slug3 4 89004 name31 slug31 5 55853 name32 slug32 6 65886 name321 slug321 7 33987 name322 slug322 8 78071 name323 slug323 9 35349 name33 slug33 10 49888 name331 slug331 11 27138 name332 slug332 12 37065 name333 slug333 13 22600 name334 slug334 14 19481 name335 slug335 15 89434 name34 slug34 16 27822 name35 slug35 17 77680 name351 slug351 18 59848 name3511 slug3511 19 41724 name3512 slug3512 20 48632 name3513 slug3513
so key table, first function matches name variable in list element, , indexes against key table , returns section id.
fn <- function(l, pat = null){ key_table[l[['name']] == key_table[['name_var']],'section_id'] }
then recursively, iterate through list , repeat fn
call on each iteration, substituting section_id name slot matched id key table. in real world, i'm guessing you'd need have na
variable applied in case there no distinct match. should point , can make adjustments needed:
the recursive function: l
list element, in case listtest
f <- function(l){ if("name" %in% names(l)){ l[['section_id']] <- fn(l = l) } else { l <- l } if(is.list(l)){ lapply(l, f) }else { l } }
output:
tojson(f(listtest), auto_unbox = true, pretty = true) [ { "section_id": 78002, "name": "name1", "slug": "slug1" }, { "section_id": 48508, "name": "name2", "slug": "slug2" }, { "section_id": 16510, "name": "name3", "slug": "slug3", "categories": [ { "section_id": 89004, "name": "name31", "slug": "slug31" }, { "section_id": 55853, "name": "name32", "slug": "slug32", "categories": [ { "section_id": 65886, "name": "name321", "slug": "slug321" }, { "section_id": 33987, "name": "name322", "slug": "slug322" }, { "section_id": 78071, "name": "name323", "slug": "slug323" } ] }, { "section_id": 35349, "name": "name33", "slug": "slug33", "categories": [ { "section_id": 49888, "name": "name331", "slug": "slug331" }, { "section_id": 27138, "name": "name332", "slug": "slug332" }, { "section_id": 37065, "name": "name333", "slug": "slug333" }, { "section_id": 22600, "name": "name334", "slug": "slug334" }, { "section_id": 19481, "name": "name335", "slug": "slug335" } ] }, { "section_id": 89434, "name": "name34", "slug": "slug34" }, { "section_id": 27822, "name": "name35", "slug": "slug35", "categories": [ { "section_id": 77680, "name": "name351", "slug": "slug351", "categories": [ { "section_id": 59848, "name": "name3511", "slug": "slug3511" }, { "section_id": 41724, "name": "name3512", "slug": "slug3512" }, { "section_id": 48632, "name": "name3513", "slug": "slug3513" } ] } ] } ] } ]
updated address removal: note: lots of code...
preface: sampled dataset above , created smaller, key_tbl
replicate condition of non-matching id, or name pattern
key_tbl <- sample_frac(key_table, 0.3) #' handles matching key table, , substituting of indexed match table, or replaces `null` `na` fn <- function(l, pat = null){ check <- match(l[['name']], key_tbl$name_var) if(is.na(check)){ na }else { key_tbl[check,'section_id'] } }
the biggest issue recursive iterations handling chain of check if condition met, if is, each sub-iteration, check if first condition met, , evaluate additional conditions; if not met, this..else..do this.. ; step through , find slots didn't meet conditions, , drop those
so here conditionals nested-lists:
#' determine if list , not data.frame array of lists is_list <- function(x){ (!is.data.frame(x) && inherits(x, "list")) } #' forcefull null detection is_null <- function(x){ identical(x, eval(parse(text = typeof(null)))) } #' forecful na detection is_na <- function(x){ if(identical(is.na(x),logical(0)) || is.na(x)){ true }else { false } } #' trick determining empty objects still class-or-object oriented #' , can throw empty return is.empty <- function(x){ if(is.list(x)){ chk <- length(x) }else if(is.character(x)){ chk <- nchar(x) }else if(is.data.frame(x)){ chk <- nrow(x) }else { chk <- 1 } if(chk == 0){ return(true) }else{ return(false) } } #' checks most/all above is.invalid <- function(x){ if(is_null(x)){ return(true) }else if(is_na(x)){ return(true) }else if(!length(x)){ return(true) }else if(is.empty(x)){ return(true) }else { return(false) } } #' vectorized function remove items within list #' meet of #' above tests. drop.invalids <- function(x){ x[!mapply(is.invalid,x)] }
putting together:
f <- function(l){ #' since we're in loop, need ensure there 'name' variable #' match against if('name' %in% names(l)){ #' we've substituted indexed id, or na l[['section_id']] <- fn(l = l) }else { #' didn't thing, since we're not in nested iteration yet. l <- l } #' since nested, , each list named list.. #' check if of items, in each sub-item, na. if(any(mapply(is.na, l[!mapply(is.list, l)]))){ #' each sub item, make sure has children, , #' if so, kids, since we're dropping array #' has na value in it. if(any(mapply(is.list, l))){ l <- l[!mapply(function(x)any(is_na(x)), l)] }else { l <- na } }else { l <- l } #' if it's list, recursively above, #' return good-stuff if(is.list(l)){ drop.invalids(lapply(l,f)) }else { drop.invalids(l) } }
now run it:
> tojson(f(listtest), pretty= true, auto_unbox = true) [ { "section_id": 50274, "name": "name1", "slug": "slug1" }, { "name": "name3", "slug": "slug3", "categories": [ { "name": "name32", "slug": "slug32", "categories": [ { "section_id": 55206, "name": "name323", "slug": "slug323" } ] }, { "name": "name33", "slug": "slug33" }, { "section_id": 92455, "name": "name34", "slug": "slug34" }, { "name": "name35", "slug": "slug35", "categories": [ { "section_id": 7045, "name": "name351", "slug": "slug351", "categories": [ { "section_id": 64549, "name": "name3511", "slug": "slug3511" }, { "section_id": 73196, "name": "name3513", "slug": "slug3513" } ] } ] } ] } ]
Comments
Post a Comment