Conversation

zhjunqin
Contributor

@zhjunqin zhjunqin commented May 22, 2020

In direct_session.cc (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/common_runtime/direct_session.cc#L1514), the session always emplaces the original key into executors_, so a new entry is added to the map for every distinct key ordering, which leads to very high memory usage.

With 10 input tensors there are 10! = 3,628,800 possible key orderings, so the map can grow enormous.

  // See if we already have the executors for this run.
  {
    mutex_lock l(executor_lock_);
    auto it = executors_.find(sorted_key);
    if (it != executors_.end()) {
      *executors_and_keys = it->second.get();

      // Insert this under the original key.  
      executors_.emplace(key, it->second); 
      return Status::OK();
    }
  }

Please check the attached file.
serving_nommap.0042.heap.base0007.pdf

Also check issue #1215

I'm not sure whether fixing the TF code or the TF Serving code would be better, so I submitted another PR, tensorflow/tensorflow#39743; please help review.

@algorithmdog

Thank you. We used the code from this PR, and it solved the out-of-memory problem in our production system.

@netfs
Collaborator

netfs commented May 29, 2020

Thanks for the bug report and the PR.

I think this is best fixed in (TF) direct_session (tensorflow/tensorflow#39743) rather than in TF Serving.

Though I am wondering why you have so many keys in your setup. If the input/output ordering is kept consistent across requests, we should not have this many keys, no?

@zhjunqin
Contributor Author

zhjunqin commented May 30, 2020

> Thanks for the bug report and the PR.
>
> I think this is best fixed in (TF) direct_session (tensorflow/tensorflow#39743) rather than in TF Serving.
>
> Though I am wondering why you have so many keys in your setup. If the input/output ordering is kept consistent across requests, we should not have this many keys, no?

I think I didn't make it clear: the root cause is that the inputs map in PredictRequest is not an ordered map.

message PredictRequest {
  // Model Specification. If version is not specified, will use the latest
  // (numerical) version.
  ModelSpec model_spec = 1;

  // Input tensors.
  // Names of input tensor are alias names. The mapping from aliases to real
  // input tensor names is stored in the SavedModel export as a prediction
  // SignatureDef under the 'inputs' field.
  map<string, TensorProto> inputs = 2;

  // Output filter.
  // Names specified are alias names. The mapping from aliases to real output
  // tensor names is stored in the SavedModel export as a prediction
  // SignatureDef under the 'outputs' field.
  // Only tensors specified here will be run/fetched and returned, with the
  // exception that when none is specified, all tensors specified in the
  // named signature will be run/fetched and returned.
  repeated string output_filter = 3;
}

So the inputs in PredictRequest can be iterated in any order inside RunPredict, even when the gRPC request message sent is identical.

Status RunPredict(const RunOptions& run_options,
                  const MetaGraphDef& meta_graph_def,
                  const optional<int64>& servable_version, Session* session,
                  const PredictRequest& request, PredictResponse* response) {
  // Validate signatures.
  const string signature_name = request.model_spec().signature_name().empty()
                                    ? kDefaultServingSignatureDefKey
                                    : request.model_spec().signature_name();
  auto iter = meta_graph_def.signature_def().find(signature_name);
  if (iter == meta_graph_def.signature_def().end()) {
    return errors::FailedPrecondition(strings::StrCat(
        "Serving signature key \"", signature_name, "\" not found."));
  }
  SignatureDef signature = iter->second;

  MakeModelSpec(request.model_spec().name(), signature_name, servable_version,
                response->mutable_model_spec());

  std::vector<std::pair<string, Tensor>> input_tensors;
  std::vector<string> output_tensor_names;
  std::vector<string> output_tensor_aliases;
  TF_RETURN_IF_ERROR(PreProcessPrediction(signature, request, &input_tensors,
                                          &output_tensor_names,
                                          &output_tensor_aliases));
@zhjunqin
Contributor Author

zhjunqin commented May 30, 2020

For example:

map<string, TensorProto> inputs = 2;

There are 3 inputs, "feature1", "feature2", and "feature3", in request.inputs, but the iteration order can differ even when the same gRPC message is sent.

  for (const auto& input : request.inputs()) {
    const string& alias = input.first;
    std::cout << alias << std::endl;  // order may vary between requests
  }
@zhjunqin zhjunqin force-pushed the add-sort-to-signature branch from a0c8875 to 7916b6d on November 4, 2020