Questions tagged [tfrecord]

TensorFlow Record Format. A TFRecord file represents a sequence of (binary) strings. The format is not random access, so it is suitable for streaming large amounts of data but not suitable if fast sharding or other non-sequential access is desired.
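To illustrate why the format is sequential rather than random access, here is a minimal pure-Python sketch of the record framing: each record is a little-endian uint64 length, a masked CRC32C of that length, the payload bytes, and a masked CRC32C of the payload. The helper names (`crc32c`, `masked_crc`, `write_record`, `read_records`) are made up for this example and are not TensorFlow APIs; in practice one would normally use `tf.io.TFRecordWriter` and `tf.data.TFRecordDataset` instead.

```python
import io
import struct

_MASK_DELTA = 0xA282EAD8  # constant used when masking the CRC


def _make_crc32c_table():
    # Lookup table for CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_CRC32C_TABLE = _make_crc32c_table()


def crc32c(data: bytes) -> int:
    # Plain table-driven CRC32C; slow, but dependency-free.
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _CRC32C_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def masked_crc(data: bytes) -> int:
    # Rotate the CRC right by 15 bits and add a constant, all in 32 bits.
    crc = crc32c(data)
    rotated = ((crc >> 15) | (crc << 17)) & 0xFFFFFFFF
    return (rotated + _MASK_DELTA) & 0xFFFFFFFF


def write_record(stream, payload: bytes) -> None:
    # Append one framed record to a binary stream.
    header = struct.pack("<Q", len(payload))
    stream.write(header)
    stream.write(struct.pack("<I", masked_crc(header)))
    stream.write(payload)
    stream.write(struct.pack("<I", masked_crc(payload)))


def read_records(stream):
    # Yield payloads one after another; there is no index to jump to.
    while True:
        header = stream.read(8)
        if not header:
            return
        if len(header) != 8:
            raise ValueError("truncated length header")
        (length,) = struct.unpack("<Q", header)
        (length_crc,) = struct.unpack("<I", stream.read(4))
        if length_crc != masked_crc(header):
            raise ValueError("corrupted length header")
        payload = stream.read(length)
        (payload_crc,) = struct.unpack("<I", stream.read(4))
        if payload_crc != masked_crc(payload):
            raise ValueError("corrupted payload")
        yield payload


# Round trip through an in-memory buffer.
buffer = io.BytesIO()
for item in (b"first", b"", b"third record"):
    write_record(buffer, item)
buffer.seek(0)
records = list(read_records(buffer))
print(records)
```

Because every record can have a different length and the file carries no index, reaching record N means walking past the N records before it, which is why the format suits streaming but not random access.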

73 votes
7 answers
51k views

How to inspect a Tensorflow .tfrecord file?

I have a .tfrecord but I don't know how it is structured. How can I inspect the schema to understand what the .tfrecord file contains? All Stackoverflow answers or documentation seem to assume I know ...
61 votes
8 answers
69k views

How do I convert a directory of jpeg images to TFRecords file in tensorflow?

I have training data that is a directory of jpeg images and a corresponding text file containing the file name and the associated category label. I am trying to convert this training data into a ...
35 votes
7 answers
40k views

TensorFlow - Read all examples from a TFRecords at once?

How do you read all examples from a TFRecords at once? I've been using tf.parse_single_example to read out individual examples using code similar to that given in the method read_and_decode in the ...
31 votes
5 answers
21k views

Obtaining total number of records from .tfrecords file in Tensorflow

Is it possible to obtain the total number of records from a .tfrecords file? Related to this, how does one generally keep track of the number of epochs that have elapsed while training models? While ...
29 votes
3 answers
14k views

how to store numpy arrays as tfrecord?

I am trying to create a dataset in tfrecord format from numpy arrays. I am trying to store 2d and 3d coordinates. The 2d coordinates are a numpy array of shape (2,10) of type float64. The 3d coordinates are ...
20 votes
1 answer
27k views

TensorFlow strings: what they are and how to work with them

When I read a file with tf.read_file, I get something with type tf.string. The documentation says only that it is "Variable length byte arrays. Each element of a Tensor is a byte array." (https://www....
18 votes
2 answers
19k views

Tensorflow TFRecord: Can't parse serialized example

I am trying to follow this guide in order to serialize my input data into the TFRecord format but I keep hitting this error when trying to read it: InvalidArgumentError: Key: my_key. Can't parse ...
16 votes
1 answer
9k views

Numpy to TFrecords: Is there a simpler way to handle batch inputs from tfrecords?

My question is about how to get batch inputs from multiple (or sharded) tfrecords. I've read the example https://github.com/tensorflow/models/blob/master/inception/inception/image_processing.py#L410. ...
16 votes
3 answers
3k views

How to efficiently save a Pandas Dataframe into one/more TFRecord file?

First I want to quickly give some background. What I want to achieve eventually is to train a fully connected neural network for a multi-class classification problem under tensorflow framework. The ...
13 votes
6 answers
5k views

Split .tfrecords file into many .tfrecords files

Is there any way to split a .tfrecords file into many .tfrecords files directly, without writing back each Dataset example?
13 votes
3 answers
4k views

How to visualize a TFRecord?

I was asked this on another forum but thought I'd post it here for anyone that is having trouble with TFRecords. TensorFlow's Object Detection API can produce strange behavior if the labels in the ...
10 votes
1 answer
11k views

AttributeError: 'Tensor' object has no attribute 'numpy' in Tensorflow 2.1

I am trying to convert the shape property of a Tensor in Tensorflow 2.1 and I get this error: AttributeError: 'Tensor' object has no attribute 'numpy' I already checked that the output of tf....
10 votes
1 answer
2k views

How to download sentinel images from google earth engine using python API in tfrecord

While trying to download a sentinel image for a specific location, the tif file is generated by default in drive, but it's not readable by OpenCV or PIL.Image(). Below is the code for the same. If I use ...
9 votes
2 answers
7k views

TensorFlow - Read video frames from TFRecords file

TLDR; my question is on how to load compressed video frames from TFRecords. I am setting up a data pipeline for training deep learning models on a large video dataset (Kinetics). For this I am using ...
9 votes
1 answer
11k views

Proper way to iterate tf.data.Dataset in session for 2.0

I have downloaded some *.tfrecord data from the youtube-8m project. You can download a 'small' portion of the data with this command: curl data.yt8m.org/download.py | shard=1,100 partition=2/video/...
9 votes
1 answer
13k views

Numpy array to TFrecord

I'm trying to train a custom dataset through tensorflow object detection api. Dataset contains 40k training images and labels which are in numpy ndarray format (uint8). training dataset shape=2 ([...
9 votes
1 answer
5k views

Tensorflow: Modern way to load large data

I want to train a convolutional neural network (using tf.keras from Tensorflow version 1.13) using numpy arrays as input data. The training data (which I currently store in a single >30GB '.npz' ...
9 votes
0 answers
2k views

How to decode Unicode string in Tensorflow's graph pipeline

I have created a tfRecord file to store data. I have to store Hindi text, so I have saved it as bytes using string.encode('utf-8'). But I am stuck at the time of reading the data. I am reading ...
8 votes
2 answers
5k views

How to use Dataset API to read TFRecords file of lists of variant length?

I want to use Tensorflow's Dataset API to read TFRecords file of lists of variant length. Here is my code. def _int64_feature(value): # value must be a numpy array. return tf.train.Feature(...
8 votes
3 answers
6k views

Tensorflow object detection API killed - OOM. How to reduce shuffle buffer size?

System information OS Platform and Distribution: CentOS 7.5.1804 TensorFlow installed from: pip install tensorflow-gpu TensorFlow version: tensorflow-gpu 1.8.0 CUDA/cuDNN version: 9.0/7.1.2 GPU model ...
8 votes
2 answers
7k views

Tensorflow/models uses COCO 90 class ids although COCO has only 80 categories

The labelmaps of TensorFlow's object_detection project contain 90 classes, although COCO has only 80 categories. Therefore the parameter num_classes in all sample configs is set to 90. If I now ...
8 votes
0 answers
672 views

TensorFlow Example vs SequenceExample

There's not that much information given in the TensorFlow documentation: https://www.tensorflow.org/api_docs/python/tf/train/Example https://www.tensorflow.org/api_docs/python/tf/train/SequenceExample ...
7 votes
1 answer
11k views

TensorFlow Dataset.shuffle - large dataset [duplicate]

I'm using TensorFlow 1.2 with a dataset in a 20G TFRecord file. There are about half a million samples in that TFRecord file. Looks like if I choose a value smaller than the amount of records in the ...
7 votes
3 answers
3k views

Tensorflow: Count number of examples in a TFRecord file -- without using deprecated `tf.python_io.tf_record_iterator`

Please read post before marking Duplicate: I was looking for an efficient way to count the number of examples in a TFRecord file of images. Since a TFRecord file does not save any metadata about the ...
7 votes
2 answers
6k views

tensorflow ValueError: features should be a dictionary of `Tensor`s. Given type: <class 'tensorflow.python.framework.ops.Tensor'>

This is my code! My tensorflow version is 1.6.0, python version is 3.6.4. If I use dataset to read the csv file directly, I can train without errors. But when I convert the csv file to a tfrecords file, it fails. I ...
7 votes
1 answer
2k views

How to import tfrecord files in a pandas dataframe?

I have a tfrecord file and would like to import it in a pandas dataframe or numpy array. I found tools to read tfrecords but they only work inside a tensorflow session, which is not the use case I ...
7 votes
2 answers
6k views

In TensorFlow 2.0, how to feed TFRecord data to keras model?

I've tried to solve a classification problem whose input data has 32 features and 16 labels with a Deep Neural Network (DNN). They look like, # Input data shape=(32,), dtype=float32, np.array([-0....
7 votes
0 answers
160 views

Writing from TF Dataset to tfrecords file

It's really easy to read a TFRecords file into a TF Dataset by using a TFRecordDataset, but is there a similar way to write into a TFRecord file given that I already have the info in a Dataset, or do ...
6 votes
1 answer
3k views

How can I save string data to TFRecord?

When saving to TFRecord, I use: def _int64_feature(value): return tf.train.Feature(int64_list=tf.train.Int64List(value=[value])) def _bytes_feature(value): return tf.train.Feature(bytes_list=...
6 votes
1 answer
4k views

Read data from TFRecord file used in Object Detection API

I want to read the data stored in a TFRecord file that I've used as a train record in TF Object Detection API. However, I get an InvalidArgumentError: Input to reshape is a tensor with 91090 values, ...
6 votes
0 answers
653 views

Tensorflow get_single_element not working with tf.data.TFRecordDataset.batch()

I am trying to perform ZCA whitening on a Tensorflow Dataset. In order to do this, I am trying to extract my data from my Dataset as a Tensor, perform the whitening, then create another Dataset after. ...
5 votes
1 answer
3k views

Shuffling tfrecords files

I have 5 tfrecords files, one for each object. While training I want to read data equally from all the 5 tfrecords i.e. if my batch size is 50, I should get 10 samples from 1st tfrecord file, 10 ...
5 votes
1 answer
6k views

Tensorflow 2.0: how to transform from MapDataset (after reading from TFRecord) to some structure that can be input to model.fit

I've stored my training and validation data on two separate TFRecord files, in which I store 4 values: signal A (float32 shape (150,)), signal B (float32 shape (150,)), label (scalar int64), id (...
5 votes
1 answer
4k views

Writing and Reading lists to TFRecord example

I want to write a list of integers (or any multidimensional numpy matrix) to one TFRecords example. For both a single value and a list of multiple values I can create the TFRecord file without error. ...
5 votes
2 answers
302 views

Chunk tensorflow dataset records into multiple records

I have an unbatched tensorflow dataset that looks like this: ds = ... for record in ds.take(3): print('data shape={}'.format(record['data'].shape)) -> data shape=(512, 512, 87) -> data ...
5 votes
4 answers
10k views

How to load tfrecord in pytorch?

How to use tfrecord with pytorch? I have downloaded "Youtube8M" datasets with video-level features, but they are stored in tfrecord. I tried to read some samples from these files to convert them to numpy ...
5 votes
3 answers
2k views

TFRecord format for multiple instances of the same or different classes on one training image

I am trying to train a Faster R-CNN on grocery dataset detection using the new Object Detection API, but I do not quite understand the process of creating a TFRecord file for that. I am aware of the ...
5 votes
3 answers
2k views

How to create multiple TFRecord files instead of making a big one and then splitting it up?

I'm dealing with quite a big time series dataset, one that is prepared as SequenceExamples and then written to a TFRecord. This results in a quite large file (over 100GB) but I'd like to have it stored in ...
5 votes
2 answers
2k views

How to add class to existing model?

I have trained a model using tensorflow object detection/SSD mobilenet. It works great! I'd like to add a class to it - just to detect pens or something. How can I do this? I have created my image ...
5 votes
1 answer
607 views

Generating TFRecord format data from C++

I'm trying to use TFRecord format to record data from C++ and then use it in python to feed TensorFlow model. TLDR; Simply serializing proto messages into a stream doesn't satisfy .tfrecord format ...
5 votes
0 answers
151 views

tfRecords shown faulty in TF2

I have a couple of tfrecord files that I made myself. They work perfectly in tf1; I used them in several projects. However, if I want to use them in Tensorflow Object Detection API with tf2 (...
4 votes
2 answers
4k views

Tensorflow Object Detection, error while generating tfrecord [TypeError: None has type NoneType, but expected one of: int, long]

When checking across different solutions available on the net, most people (including datitran) pointed out that it might be a missing class or a misspelling of a class in the train csv file. I am not able ...
4 votes
2 answers
2k views

TFrecords occupy more space than original JPEG images

I'm trying to convert my JPEG image set into TFRecords, but the TFRecord file is taking almost 5x more space than the image set. After a lot of googling, I learned that when JPEGs are written into ...
4 votes
2 answers
2k views

Tensorflow: read variable length data, via Dataset (tfrecord)

I would like to read some TFRecord data. This works, but only for fixed-length data; now I would like to do the same thing with variable-length data (VarLenFeature). def load_tfrecord_fixed(...
4 votes
2 answers
3k views

Write and Read SparseTensor to and from a tfrecord file

Is it possible to do this elegantly? Right now the only thing I can think of is to save the indices (tf.int64), values (tf.float32), and shape (tf.int64) of the SparseTensor in 3 separate Features (the ...
4 votes
3 answers
1k views

Output TFRecord to Google Cloud Storage from Python

I know tf.python_io.TFRecordWriter has a concept of GCS, but it doesn't seem to have permissions to write to it. If I do the following: output_path = 'gs://my-bucket-name/{}/{}.tfrecord'.format(...
4 votes
1 answer
4k views

How do I write an encoded jpeg as bytes to Tensorflow tfrecord and then read it?

I am trying to use TensorFlow's tfrecords format to store my datasets. I managed to read in jpeg images and decode them to raw format and write them to a tfrecord file. I can then later read them ...
4 votes
1 answer
2k views

Unable to retrain the instance segmentation model

I'm trying to train the instance segmentation model. I'm using the following code to generate the tfrecord. flags = tf.app.flags flags.DEFINE_string('data_dir', '', 'Root directory to raw pet dataset.'...
4 votes
2 answers
2k views

Writing tfrecord with multithreading is not as fast as expected

Tried to write tfrecord w/ and w/o multithreading, and found the speed difference is not much (w/ 4 threads: 434 seconds; w/o multithread 590 seconds). Not sure if I used it correctly. Is there any ...
4 votes
1 answer
2k views

create tfrecord for object detection task

I'm creating my dataset for a fine tuning task using tensorflow object detection api. My directory structure is : train/ -- imgs/ ---- img1.jpg -- ann/ ---- img1.csv where the csv, one per ...
